Fads and fallacies in the name of small-sample microarray classification - A highlight of misunderstanding and erroneous usage in the applications of genomic signal processing

This article highlights topics in the applications of genomic signal processing (GSP) that have a record of misunderstanding and erroneous usage. We focus on the related subjects of feature selection, classifier design, and error estimation, which together form a microarray classification pipeline, and discuss some of the most common "fads" and "fallacies" regarding classification methods that are routinely applied in the analysis of small-sample microarray data.
