Advanced machine learning techniques for microarray spot quality classification

Microarray printing, hybridization, and washing often introduce erroneous measurements, and these errors degrade automated microarray spot quality classification. Identifying and removing them is therefore crucial if automation is to replace the still-common practice of visually assessing spot quality, an extremely expensive and time-consuming procedure. A major problem with the microarray spot quality classification methods proposed in the literature is the correlation among the features extracted from the spots. In this paper, we propose using a random subspace ensemble of neural networks together with a feature selection algorithm to improve the performance of our microarray spot quality classification method. Our best method obtains an error under the receiver operating characteristic curve (EAUR) of 0.3, outperforming a stand-alone support vector machine, which obtains an EAUR of 1.7. The consistency of our proposed approach makes it a viable alternative to the labour-intensive manual method of spot quality assessment.
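To make the random subspace idea concrete, the following is a minimal sketch and not the implementation evaluated in this paper. It assumes a single logistic unit as a stand-in for each neural network, the mean rule for combining member outputs, and synthetic correlated features; the feature selection step is not shown. All function names (`make_toy_spots`, `train_logistic`, `fit_random_subspace`, `predict_proba`) and all parameter values are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def make_toy_spots(n=100, n_feat=6, seed=1):
    """Synthetic stand-in for spot features: every feature is a noisy copy
    of one latent value, so the features are strongly correlated."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2  # 1 = good spot, 0 = bad spot (assumed labelling)
        base = rng.gauss(1.0 if label else -1.0, 0.5)
        X.append([base + rng.gauss(0.0, 0.3) for _ in range(n_feat)])
        y.append(label)
    return X, y


def train_logistic(X, y, epochs=200, lr=0.1):
    """Fit one logistic unit by batch gradient descent.
    Assumption: this replaces the neural network used as base classifier."""
    n_feat = len(X[0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * n_feat, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_feat):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b


def fit_random_subspace(X, y, n_members=15, subspace_frac=0.5, seed=0):
    """Train each ensemble member on its own random subset of the features."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    k = max(1, int(round(subspace_frac * n_feat)))
    members = []
    for _ in range(n_members):
        idx = sorted(rng.sample(range(n_feat), k))
        X_sub = [[row[j] for j in idx] for row in X]
        members.append((idx, train_logistic(X_sub, y)))
    return members


def predict_proba(members, x):
    """Combine member outputs with the mean rule (an assumed choice)."""
    scores = [
        sigmoid(sum(wj * x[j] for wj, j in zip(w, idx)) + b)
        for idx, (w, b) in members
    ]
    return sum(scores) / len(scores)


X, y = make_toy_spots()
members = fit_random_subspace(X, y)
predictions = [1 if predict_proba(members, x) >= 0.5 else 0 for x in X]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(f"training accuracy on toy data: {accuracy:.2f}")
```

Because each member sees only part of the feature set, members trained on correlated features still differ from one another, which is the diversity the ensemble relies on. The accuracy printed here is measured on the toy training data and says nothing about the EAUR values reported above.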
