Normalization and Technical Variation in Gene Expression Measurements

Using data from the Microarray Quality Control (MAQC) project, we demonstrate two data-analysis methods that shed light on the normalization of gene expression measurements and thereby on their technical variation. One is an improved method for normalizing multiple assays whose mRNA concentrations are related by a parametric model. The other is a method for characterizing the limits on how far normalization can reduce technical variation. We apply our improved normalization to the four project materials as part of testing the linearity of the probe responses. We find that the lack of linearity is statistically significant but small enough that its sources cannot be easily identified. Applying our characterization method to assays of the same material, we show that there is a source of variation that cannot be eliminated by normalization and therefore must be dealt with by other means. Four high-density, single-probe, one-color microarray platforms underlie our demonstration.
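The idea of testing probe linearity on materials whose concentrations are related by a known model can be illustrated with a minimal sketch. It is not the improved normalization described above. It assumes, hypothetically, that a mixture material is a known blend of two reference materials (the fraction `frac_a` is an illustrative parameter, not taken from this abstract), that each probe responds linearly, and that normalization reduces to a single per-assay scale factor fitted by least squares. All function names are made up for the illustration.

```python
def mixture_prediction(signal_a, signal_b, frac_a):
    """Per-probe signal expected for a blend of materials A and B,
    assuming each probe responds linearly to concentration."""
    return [frac_a * a + (1.0 - frac_a) * b
            for a, b in zip(signal_a, signal_b)]


def fit_scale(observed, predicted):
    """Least-squares scale factor s minimizing sum((obs - s * pred)^2).
    Stands in for normalization; a real method would be richer."""
    num = sum(o * p for o, p in zip(observed, predicted))
    den = sum(p * p for p in predicted)
    return num / den


def linearity_residuals(signal_a, signal_b, signal_mix, frac_a):
    """Return the fitted scale factor and the per-probe residuals.
    Residuals that differ systematically from zero suggest
    departures from linear probe response."""
    predicted = mixture_prediction(signal_a, signal_b, frac_a)
    scale = fit_scale(signal_mix, predicted)
    residuals = [o - scale * p for o, p in zip(signal_mix, predicted)]
    return scale, residuals


# Toy data (invented): the mixture assay reads exactly twice the
# linear prediction, so the residuals vanish after scaling.
a = [100.0, 200.0, 400.0]
b = [300.0, 100.0, 400.0]
mix = [300.0, 350.0, 800.0]
scale, residuals = linearity_residuals(a, b, mix, frac_a=0.75)
```

With real assays the residuals would not vanish. Whatever structure remains after the best-fitting normalization is the kind of variation that normalization alone cannot remove.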