STATISTICAL ANALYSIS OF GENE EXPRESSION MICROARRAYS

This manuscript is composed of two major sections. In the first section, we introduce some of the biological principles that form the basis of cDNA microarrays and explain how the different analytical steps introduce variability and potential biases in gene expression measurements that can be difficult to address properly. We address statistical issues associated with the measurement of gene expression (e.g., image segmentation, spot identification), with the correction for background fluorescence, and with the normalization and re-scaling of data to remove the effects of dye, print-tip, and other factors on expression. In this section we also describe the standard statistical approaches for estimating treatment effects on gene expression, and briefly address the multiple comparisons problem, often referred to as the "big p, small n" paradox. In the second major section, we discuss the use of multiple scans as a means to reduce the variability of gene expression estimates. While the use of multiple scans under the same laser and sensor settings has already been proposed (Romualdi et al. 2003), we describe a general hierarchical modeling approach proposed by Love and Carriquiry (2005) that enables the use of all readings obtained under varied laser and sensor settings for each slide in the analyses, even if the number of readings per slide varies across slides. This technique also uses the varied settings to correct for some of the censoring discussed in the first section. When scans are combined and censoring is corrected for, the estimate of gene expression is expected to have smaller variance than one based on a single spot measurement. In turn, expression estimates with smaller variance are expected to increase the power of the statistical tests performed on them.

[1] C. L. Armstrong, et al. Establishment and maintenance of friable, embryogenic maize callus and the involvement of L-proline, 1985, Planta.

[2] Y. Benjamini, et al. The control of the false discovery rate in multiple testing under dependency, 2001.

[3] G. Churchill, et al. Experimental design for gene expression microarrays, 2001, Biostatistics.

[4] T. Speed, et al. Statistical issues in cDNA microarray data analysis, 2003, Methods in molecular biology.

[5] Christina Kendziorski, et al. On Differential Variability of Expression Ratios: Improving Statistical Inference about Gene Expression Changes from Microarray Data, 2001, J. Comput. Biol.

[6] C. M. Kendziorski, et al. On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles, 2003, Statistics in medicine.

[7] Chiara Romualdi, et al. Improved detection of differentially expressed genes in microarray experiments through multiple scanning and image integration, 2003, Nucleic acids research.

[8] Y. Benjamini, et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing, 1995.

[9] E. Hovig, et al. Profound influence of microarray scanner characteristics on gene expression ratios: analysis and procedure for correction, 2004, BMC Genomics.

[10] S. Dudoit, et al. Multiple Hypothesis Testing in Microarray Experiments, 2003.

[11] John Aach, et al. Measuring absolute expression with microarrays with a calibrated reference sample and an extended signal intensity range, 2002, Proceedings of the National Academy of Sciences of the United States of America.

[12] S. Dudoit, et al. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation, 2002, Nucleic acids research.

[13] Ingrid Lönnstedt. Replicated microarray data, 2001.

[14] John D. Storey. A direct approach to false discovery rates, 2002.

[15] W. Cleveland. Robust Locally Weighted Regression and Smoothing Scatterplots, 1979.

[16] Tanzy M. T. Love, et al. Incorporating Multiple cDNA Microarray Slide Scans-Application to Somatic Embryogenesis in Maize, 2004.

[17] D. Edwards, et al. Statistical Analysis of Gene Expression Microarray Data, 2003.

[18] R. Phillips, et al. Plant Regeneration from Tissue Cultures of Maize, 1975.

[19] Kan Wang, et al. Gene Expression Patterns During Somatic Embryo Development and Germination in Maize Hi II Callus Cultures, 2006, Plant Molecular Biology.

[20] Oswaldo Trelles, et al. Saturation and Quantization Reduction in Microarray Experiments using Two Scans at Different Sensitivities, 2004, Statistical applications in genetics and molecular biology.