Predicting single nucleotide polymorphisms (SNPs) from DNA sequence using support vector machines.

Recently, SNPs have gained substantial attention as genetic markers and are recognized as a key element in the development of personalized medicine. Computational prediction of SNPs can guide SNP discovery, reducing the cost and time needed for the development of personalized medicine. We have developed a method for SNP prediction based on support vector machines (SVMs) using different features extracted from the SNP data. Prediction rates of 60.9% were achieved with the sequence feature, 59.1% with the free-energy feature, 58.1% with the GC content feature, 58.0% with the melting temperature feature, 56.2% with the enthalpy feature, 55.1% with the entropy feature and 54.3% with the gene, exon and intron feature. We also introduce a new feature, the SNP distribution score, which achieved a prediction rate of 77.3%. Thus, the proposed SNP prediction algorithm can be used in SNP discovery.
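
As a rough illustration of how per-window features of the kind listed above might be computed before being passed to an SVM, the following minimal sketch encodes a DNA window as a one-hot sequence feature and appends its GC content and an approximate melting temperature. The function names, the one-hot encoding, and the use of the Wallace rule (2 °C per A/T, 4 °C per G/C) are assumptions made for illustration only; the abstract does not specify how the authors computed these features, and the thermodynamic features (free energy, enthalpy, entropy) and the SNP distribution score are not reproduced here.

```python
from typing import List

BASES = "ACGT"  # assumed column order for the one-hot encoding


def one_hot(window: str) -> List[int]:
    """Encode each base as four 0/1 indicators (A, C, G, T).

    Bases outside ACGT (e.g. N) become four zeros.
    """
    vec: List[int] = []
    for base in window.upper():
        vec.extend(1 if base == b else 0 for b in BASES)
    return vec


def gc_content(window: str) -> float:
    """Fraction of G and C bases in the window (0.0 for an empty window)."""
    w = window.upper()
    if not w:
        return 0.0
    return (w.count("G") + w.count("C")) / len(w)


def wallace_tm(window: str) -> float:
    """Approximate melting temperature in degrees C via the Wallace rule.

    This rule is a stand-in chosen for simplicity; it is only a crude
    estimate and is intended for short oligonucleotides.
    """
    w = window.upper()
    at = w.count("A") + w.count("T")
    gc = w.count("G") + w.count("C")
    return 2.0 * at + 4.0 * gc


def feature_vector(window: str) -> List[float]:
    """Concatenate the one-hot sequence feature with GC content and Tm."""
    return [float(x) for x in one_hot(window)] + [
        gc_content(window),
        wallace_tm(window),
    ]
```

Vectors built this way for windows labeled as SNP or non-SNP sites could then be supplied to any SVM implementation; the choice of kernel, window length, and feature scaling would all need to be tuned and is not taken from the source.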