A new method for classification in DNA sequence

DNA sequences are an important part of biological sequence data, and a DNA sequence determines the type and function of the DNA. This paper proposes a classification method that removes the independent random background during DNA sequence feature extraction, which avoids repeated computation in information extraction. The dataset is then clustered with the k-means method and classified with an SVM algorithm, and the final result is determined by voting over the two outputs. Experimental results show that the algorithm achieves better search efficiency and better results.
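The abstract does not specify the features or the voting rule, so the following is only a minimal sketch under stated assumptions. It assumes that "independent random background" means the k-mer frequency expected if bases occurred independently, and that voting is a simple majority over predicted labels. The function names `background_corrected_features` and `majority_vote` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def background_corrected_features(seq, k=2):
    """Observed k-mer frequency minus the frequency expected under
    independent bases (an assumed reading of 'independent random background')."""
    seq = seq.upper()
    n = len(seq)
    if n < k:
        raise ValueError("sequence is shorter than k")

    # Single-base frequencies define the assumed independent background model.
    base_freq = {b: seq.count(b) / n for b in ALPHABET}

    # Count overlapping k-mers once, then reuse the counts for every feature.
    windows = n - k + 1
    kmer_counts = Counter(seq[i:i + k] for i in range(windows))

    features = []
    for kmer in map("".join, product(ALPHABET, repeat=k)):
        observed = kmer_counts[kmer] / windows
        expected = 1.0
        for b in kmer:
            expected *= base_freq[b]
        features.append(observed - expected)
    return features


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


if __name__ == "__main__":
    vec = background_corrected_features("ACGTACGTTTGA")
    print(len(vec))  # one feature per dinucleotide
    # Hypothetical labels, e.g. from a k-means cluster assignment and SVM models.
    print(majority_vote(["class1", "class2", "class2"]))
```

In such a setup, the feature vectors would be passed to a k-means clusterer and an SVM classifier, and `majority_vote` would combine their per-sequence labels. How the paper actually maps clusters to classes and counts votes is not stated in the abstract.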
