Microarray Experiment Design and Statistical Analysis

Microarray technology is a new kid on the block for biological researchers, and it has shown great potential in many fields. While biologists are thrilled by the power of this new technology, they are also overwhelmed by the enormous amount of data it generates, and are often unsure how to extract useful information from those data. On the one hand, there is always some information to be mined from such large data sets; on the other hand, it is almost certain that spurious findings will result even when extreme care is exercised in mining the data. This challenge mandates interdisciplinary cooperation among biologists, statisticians, and computer scientists. Although research in this area is still in its infancy, our goal in this chapter is to discuss some of the most prominent issues in experiment design and statistical analysis for microarray research and to present possible solutions to some of these problems.
