Computer Engineering & Science
An Approach to Sequence Recognition Based on the Similarity Learning of MultiDegree Distortion Subsequence
Received date: 2010-03-30
Revised date: 2010-06-28
Online published: 2011-02-25
In sequence recognition, sequences that share a label are often not strictly similar, because many factors perturb them. Measuring the similarity between signature sequences at multiple scales helps to obtain high-quality similarity measures. This paper proposes a new method for sequence recognition based on distorted subsequences. A kernel function that accounts for distortions of various degrees is defined on the feature space spanned by the distorted subsequences, and an efficient linear-cost algorithm is designed to compute the high-dimensional feature vectors. A combination of the kernel matrices for the different degrees of distortion is learned and optimized through semidefinite programming (SDP). Combining the optimized kernel with a Support Vector Machine (SVM) yields a classifier with a softer boundary that allows the most appropriate degree of distortion within the sequences. Experiments on the SCOP 1.37 PDB90 benchmark database show that the classifier improves recognition accuracy for most protein sequences in the 33 families of the benchmark.
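The idea of combining kernels computed at several degrees of distortion can be illustrated with a toy sketch. It rests on assumptions that the abstract does not confirm: a "distorted subsequence" is modelled here as a length-k window that differs from a reference k-mer in at most m positions (a mismatch-style kernel), the feature vector is computed by brute force rather than by the linear-cost algorithm the paper describes, and the combination weights are fixed by hand instead of being learned through SDP. The function names and the example weights are hypothetical.

```python
from itertools import product
from math import sqrt


def kmers(seq, k):
    """Return all contiguous windows of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def feature_vector(seq, k, m, alphabet):
    """For every k-mer over the alphabet, count the windows of seq
    that match it with at most m mismatches (brute force, for illustration)."""
    windows = kmers(seq, k)
    return [
        sum(hamming(w, "".join(p)) <= m for w in windows)
        for p in product(alphabet, repeat=k)
    ]


def kernel(s, t, k, m, alphabet):
    """Inner product of the two feature vectors at distortion degree m."""
    fs = feature_vector(s, k, m, alphabet)
    ft = feature_vector(t, k, m, alphabet)
    return sum(a * b for a, b in zip(fs, ft))


def normalized_kernel(s, t, k, m, alphabet):
    """Kernel scaled so that every sequence has unit self-similarity."""
    kss = kernel(s, s, k, m, alphabet)
    ktt = kernel(t, t, k, m, alphabet)
    if kss == 0 or ktt == 0:
        return 0.0
    return kernel(s, t, k, m, alphabet) / sqrt(kss * ktt)


def combined_kernel(s, t, k, weights, alphabet):
    """Weighted sum of normalized kernels, one per distortion degree.
    weights maps degree m -> non-negative weight (hand-picked here,
    whereas the paper learns the combination via SDP)."""
    return sum(
        w * normalized_kernel(s, t, k, m, alphabet)
        for m, w in weights.items()
    )


if __name__ == "__main__":
    alphabet = "ACGT"
    weights = {0: 0.5, 1: 0.3, 2: 0.2}  # hypothetical weights
    print(combined_kernel("ACGTAC", "ACCTAC", 3, weights, alphabet))
```

A Gram matrix built from `combined_kernel` over a training set could then be passed to any SVM implementation that accepts a precomputed kernel. Because a non-negative weighted sum of valid kernels is itself a valid kernel, the combined matrix remains positive semidefinite.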
QIU Dehong, FANG Shaohong, SUN Lei. An Approach to Sequence Recognition Based on the Similarity Learning of MultiDegree Distortion Subsequence[J]. Computer Engineering & Science, 2011, 33(2): 153-158. DOI: 10.3969/j.issn.1007130X.2011.