Breast Cancer Diagnosis from Proteomic Mass Spectrometry Data: A Comparative Evaluation

This paper reviews the performance of a wide range of classifiers applied to proteomic mass spectrometry data in a blind comparative assessment organised by Bart Mertens. It summarises the different approaches, describes the issues involved in evaluating and comparing the predictions, and examines the results of each method. Although the methods perform differently, their rank ordering varies according to how performance is measured, so no unequivocal conclusion can be drawn about which is 'best'. Instead, what matters is not the method by itself but the interaction of method and user, that is, the user's degree of sophistication with the method. Nevertheless, such competitions serve the useful role of setting (constantly improving) baselines against which new researchers can pit their wits and methods, and of providing standards against which new methods should be assessed.
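The dependence of rank ordering on the performance measure can be illustrated with a minimal sketch. The labels and scores below are invented for illustration and are not taken from the competition; the names `scores_a` and `scores_b` stand for two hypothetical classifiers. Error-based accuracy at a fixed threshold of 0.5 and the area under the ROC curve (AUC), two commonly used measures, rank the two classifiers in opposite orders.

```python
# Toy data (assumed, not from the competition): 1 = case, 0 = control.
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hypothetical classifier A: confident and mostly right, but one case scores very low.
scores_a = [0.90, 0.80, 0.70, 0.10, 0.40, 0.30, 0.20, 0.15]

# Hypothetical classifier B: orders every case above every control,
# but most case scores fall just under the 0.5 threshold.
scores_b = [0.48, 0.47, 0.46, 0.90, 0.10, 0.20, 0.30, 0.45]


def accuracy(labels, scores, threshold=0.5):
    """Proportion of samples classified correctly at a fixed threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def auc(labels, scores):
    """AUC as the proportion of (case, control) pairs in which the case
    receives the higher score; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


for name, scores in [("A", scores_a), ("B", scores_b)]:
    print(name, "accuracy:", accuracy(labels, scores), "AUC:", auc(labels, scores))
```

On this toy data classifier A has the higher accuracy while classifier B has the higher AUC, so neither can be called 'best' without first fixing the measure.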
