Computational diagnostics with gene expression profiles.

Gene expression profiling with microarrays is a modern approach to molecular diagnostics. In clinical microarray studies, researchers aim to predict disease type, survival, or treatment response from gene expression profiles. Along the way, they encounter a series of obstacles and pitfalls. This chapter reviews the fundamental machine learning issues involved and recommends a procedure for the computational aspects of a clinical microarray study.
