A new SVM model for classifying genetic data

We propose a new formulation of the Support Vector Machine (SVM) for classifying genetic data. It builds on ideas from the method of total least squares, in which assumed errors in the measured data are incorporated into the model design. For genetic data, the number of features is typically far greater than the sample size. Consequently, in our method, we introduce Lagrange multipliers and solve for the dual variables. Instead of minimizing the Lagrangian function, we solve the nonlinear system of equations obtained from the Karush-Kuhn-Tucker conditions. We also implement complementarity constraints and weight the linear system by the inverse covariance matrix of the measured data. Compared with the standard SVM, the proposed algorithm gives improved results and higher sensitivity when classifying a set of Alzheimer's Disease Positron Emission Tomography images. It is also more robust to noise than the standard SVM.
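The idea of solving a Karush-Kuhn-Tucker system for the dual variables, rather than minimizing the Lagrangian directly, can be illustrated with a simpler, related model. The sketch below is not the total-least-squares formulation proposed here: it swaps in the least squares SVM, whose KKT conditions happen to form a linear system, whereas the proposed method leads to a nonlinear one. The function names `fit_ls_svm` and `predict`, the regularization parameter `gamma`, and the synthetic data are all assumptions made for illustration. The example uses far more features than samples, so the dual system is small (one unknown per sample plus a bias).

```python
import numpy as np

def fit_ls_svm(X, y, gamma=1.0):
    """Solve the KKT linear system of a linear-kernel least squares SVM.

    Illustrative stand-in only: hypothetical helper, not the paper's method.
    Returns the dual variables alpha (one per sample) and the bias b.
    """
    n = X.shape[0]
    K = X @ X.T                      # n x n Gram matrix, independent of feature count
    Omega = np.outer(y, y) * K       # Omega[i, j] = y_i * y_j * <x_i, x_j>
    A = np.zeros((n + 1, n + 1))     # KKT matrix in the unknowns (b, alpha)
    A[0, 1:] = y                     # stationarity w.r.t. b: sum_i alpha_i y_i = 0
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]

def predict(X_train, y_train, alpha, b, X_new):
    """Classify new rows with the dual expansion sum_i alpha_i y_i <x_i, x> + b."""
    scores = (X_new @ X_train.T) @ (alpha * y_train) + b
    return np.where(scores >= 0, 1.0, -1.0)

# Synthetic stand-in for "features >> samples": 20 samples, 500 features.
rng = np.random.default_rng(0)
n, p = 20, 500
y = np.where(np.arange(n) < n // 2, 1.0, -1.0)
X = rng.normal(size=(n, p))
X[:, :5] += 2.0 * y[:, None]         # only the first 5 features carry class signal

gamma = 10.0
alpha, b = fit_ls_svm(X, y, gamma=gamma)
train_pred = predict(X, y, alpha, b, X)
```

Working in the dual keeps the system at size 21 x 21 here, regardless of the 500 features. A nonlinear KKT system, as in the proposed method, would need an iterative solver in place of the single call to `np.linalg.solve`.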
