A novel approach to knowledge discovery and representation in biological databases
dc.contributor.author | Lu, Jing | |
dc.contributor.author | Wang, Cuiqing | |
dc.contributor.author | Keech, Malcolm | |
dc.date.accessioned | 2020-09-11T10:04:19Z | |
dc.date.available | 2020-09-11T10:04:19Z | |
dc.date.issued | 2017-09-25 | |
dc.identifier.citation | Lu J, Wang C, Keech M (2017) 'A novel approach to knowledge discovery and representation in biological databases', International Journal of Bioinformatics Research and Applications, 13 (4), pp.352-375. | en_US |
dc.identifier.issn | 1744-5485 | |
dc.identifier.doi | 10.1504/IJBRA.2017.087384 | |
dc.identifier.uri | http://hdl.handle.net/10547/624500 | |
dc.description.abstract | Extraction of motifs from biological sequences is among the frontier research issues in bioinformatics, with sequential patterns mining becoming one of the most important computational techniques in this area. A number of applications motivate the search for more structured patterns and concurrent protein motif mining is considered here. This paper builds on the concept of structural relation patterns and applies the concurrent sequential patterns (ConSP) mining approach to biological databases. Specifically, an original method is presented using support vectors as the data structure for the extraction of novel patterns in protein sequences. Data modelling is pursued to represent the more interesting concurrent patterns visually. Experiments with real-world protein datasets from the UniProt and NCBI databases highlight the applicability of the ConSP methodology in protein data mining and modelling. The results show the potential for knowledge discovery in the field of protein structure identification. A pilot experiment extends the methodology to DNA sequences to indicate a future direction. | en_US |
dc.language.iso | en | en_US |
dc.publisher | Inderscience | en_US |
dc.relation.url | http://www.inderscience.com/offer.php?id=87384 | en_US |
dc.rights | Yellow - can archive pre-print (i.e. pre-refereeing) | |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | bioinformatics | en_US |
dc.subject | data analytics | en_US |
dc.subject | structural relations | en_US |
dc.subject | biological databases | en_US |
dc.subject | concurrent vector method | en_US |
dc.subject | graphical modeling | en_US |
dc.subject | protein motif mining | en_US |
dc.subject | sequential patterns post-processing | en_US |
dc.subject | knowledge discovery | en_US |
dc.title | A novel approach to knowledge discovery and representation in biological databases | en_US |
dc.type | Article | en_US |
dc.identifier.eissn | 1744-5493 | |
dc.identifier.journal | International Journal of Bioinformatics Research and Applications | en_US |
dc.date.updated | 2020-09-11T09:55:00Z | |
refterms.dateFOA | 2020-09-11T10:04:19Z |