Handle:
http://hdl.handle.net/10547/334492
Title:
Protein data modelling for concurrent sequential patterns
Authors:
Lu, Jing; Keech, Malcolm; Wang, Cuiqing
Abstract:
Protein sequences from the same family typically share common patterns which imply their structural function and biological relationship. The challenge of identifying protein motifs is often addressed through mining frequent itemsets and sequential patterns, where post-processing is a useful technique. Earlier work has shown that Concurrent Sequential Patterns mining can be applied in bioinformatics, e.g. to detect frequently occurring concurrent protein sub-sequences. This paper presents a companion approach to data modelling and visualisation, applying it to real-world protein datasets from the PROSITE and NCBI databases. The results show the potential for graph-based modelling in representing the integration of higher level patterns common to all or nearly all of the protein sequences.
Affiliation:
University of Bedfordshire
Citation:
Lu, J., Keech, M., Wang, C., (2014) 'Protein Data Modelling for Concurrent Sequential Patterns' 5th International Workshop on Biological Knowledge Discovery and Data Mining, Munich 3rd September.
Publisher:
DEXA
Issue Date:
Sep-2014
URI:
http://hdl.handle.net/10547/334492
Additional Links:
http://www.dexa.org/previous/dexa2014/ws_program387a.html?cid=439
Type:
Conference papers, meetings and proceedings
Language:
en
Appears in Collections:
Centre for Research in Distributed Technologies (CREDIT)

Full metadata record

DC Field | Value | Language
dc.contributor.author | Lu, Jing | en
dc.contributor.author | Keech, Malcolm | en
dc.contributor.author | Wang, Cuiqing | en
dc.date.accessioned | 2014-11-11T12:46:35Z | -
dc.date.available | 2014-11-11T12:46:35Z | -
dc.date.issued | 2014-09 | -
dc.identifier.citation | Lu, J., Keech, M., Wang, C., (2014) 'Protein Data Modelling for Concurrent Sequential Patterns' 5th International Workshop on Biological Knowledge Discovery and Data Mining, Munich 3rd September. | en
dc.identifier.uri | http://hdl.handle.net/10547/334492 | -
dc.description.abstract | Protein sequences from the same family typically share common patterns which imply their structural function and biological relationship. The challenge of identifying protein motifs is often addressed through mining frequent itemsets and sequential patterns, where post-processing is a useful technique. Earlier work has shown that Concurrent Sequential Patterns mining can be applied in bioinformatics, e.g. to detect frequently occurring concurrent protein sub-sequences. This paper presents a companion approach to data modelling and visualisation, applying it to real-world protein datasets from the PROSITE and NCBI databases. The results show the potential for graph-based modelling in representing the integration of higher level patterns common to all or nearly all of the protein sequences. | en
dc.language.iso | en | en
dc.publisher | DEXA | en
dc.relation.url | http://www.dexa.org/previous/dexa2014/ws_program387a.html?cid=439 | en
dc.subject | protein sequences | en
dc.subject | data mining | en
dc.subject | concurrent sequential patterns (ConSP) | en
dc.subject | bioinformatics | en
dc.subject | ConSP modelling | en
dc.subject | biological databases | en
dc.subject | knowledge representation | en
dc.subject | visualization | en
dc.title | Protein data modelling for concurrent sequential patterns | en
dc.type | Conference papers, meetings and proceedings | en
dc.contributor.department | University of Bedfordshire | en
All Items in UOBREP are protected by copyright, with all rights reserved, unless otherwise indicated.