Statistical Analysis of DNA Microarray Data in Cancer Research

Microarray techniques are widely used to monitor gene expression in many areas of biomedical research. In cancer research, they are applied to tumor diagnosis and classification, prediction of prognosis and treatment response, and the study of molecular mechanisms, biochemical pathways, and gene networks. Statistical methods are vital to these scientific endeavors. This article reviews recent developments in statistical methods for analyzing data from microarray experiments. Emphasis is given to normalization of expression data from multiple arrays, selection of significantly differentially expressed genes, tumor classification, and gene expression pathways and networks.
