Analysis of microarray gene expression data

This chapter reviews the methods used to process and analyze gene expression data generated with DNA microarrays. Such experiments measure the relative abundance of mRNA in a set of tissue samples or cell populations for thousands of genes simultaneously, which makes computational and statistical analysis indispensable. In the early stages of processing, the computational procedures are largely determined by the technology and the experimental setup. Later, once more reliable intensity values for the genes are available, pattern discovery methods come into play. The most striking peculiarity of this kind of data is that measurements are usually obtained for thousands of genes but under a much smaller number of conditions. This imbalance is at the root of several of the statistical questions discussed here.
