A supervised data-driven approach for microarray spot quality classification

This paper addresses the problem of classifying the quality of microarray spots using concepts from supervised learning theory. After extracting spots from the microarray image, the proposed method computes several features that capture shape, color and variability. These features are classified using support vector machines, a recent and widely used statistical classification technique. The method makes no assumptions about the problem and requires no a priori information. The system has been tested on a real case under several parameter configurations. Experimental results show the effectiveness of the approach, including in comparison with state-of-the-art methods.
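
The pipeline described above (per-spot features followed by a support vector machine) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the three features (foreground area fraction for shape, mean foreground intensity for color, foreground standard deviation for variability), the fixed threshold, the synthetic disc-shaped spots, and the choice of a linear SVM trained by stochastic subgradient descent on the regularized hinge loss. The names `spot_features`, `make_spot`, `fit_scaler`, `scale`, `train_linear_svm` and `predict` are hypothetical.

```python
import math
import random

def spot_features(patch, threshold=0.5):
    """Return [shape, color, variability] for a grayscale patch (2D list, values in [0, 1])."""
    pixels = [v for row in patch for v in row]
    fg = [v for v in pixels if v > threshold]       # crude foreground segmentation (assumed)
    if not fg:
        return [0.0, 0.0, 0.0]
    mean = sum(fg) / len(fg)
    std = math.sqrt(sum((v - mean) ** 2 for v in fg) / len(fg))
    return [len(fg) / len(pixels), mean, std]

def make_spot(radius, level, noise, rng, size=15):
    """Synthetic disc-shaped spot on a dark background (illustrative data only)."""
    c = (size - 1) / 2
    def pixel(r, q):
        base = level if (r - c) ** 2 + (q - c) ** 2 <= radius ** 2 else 0.05
        return min(1.0, max(0.0, base + rng.uniform(-noise, noise)))
    return [[pixel(r, q) for q in range(size)] for r in range(size)]

def fit_scaler(X):
    """Per-feature mean and standard deviation, used to z-score the features."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
            for col, m in zip(zip(*X), means)]
    return means, stds

def scale(x, means, stds):
    return [(v - m) / s for v, m, s in zip(x, means, stds)]

def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=100, seed=0):
    """Stochastic subgradient descent on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [(1 - eta * lam) * wj for wj in w]  # shrink step from the regularizer
            if margin < 1:                          # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b

def predict(x, w, b):
    """Return +1 (good spot) or -1 (bad spot) from the sign of the decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Build a small labeled training set of synthetic good and bad spots.
rng = random.Random(42)
good = [spot_features(make_spot(5, 0.9, 0.05, rng)) for _ in range(20)]
bad = [spot_features(make_spot(3, 0.7, 0.25, rng)) for _ in range(20)]
raw = good + bad
labels = [1] * 20 + [-1] * 20
means, stds = fit_scaler(raw)
w, b = train_linear_svm([scale(x, means, stds) for x in raw], labels)

# Classify two unseen synthetic spots with the same scaling.
new_good = scale(spot_features(make_spot(5, 0.9, 0.05, rng)), means, stds)
new_bad = scale(spot_features(make_spot(3, 0.7, 0.25, rng)), means, stds)
print(predict(new_good, w, b), predict(new_bad, w, b))
```

The features are z-scored before training because hinge-loss optimization is sensitive to feature scale. A kernel SVM from an established library would be the more likely choice in practice; the linear variant is used here only to keep the sketch dependency-free.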
