Oligonucleotide arrays: information from replication and spatial structure

MOTIVATION: The introduction of oligonucleotide DNA arrays has resulted in much debate concerning appropriate models for the measurement of gene expression. By contrast, little account has been taken of the possibility of identifying the physical imperfections in the raw data.

RESULTS: This paper demonstrates that, with the use of replicates and an awareness of the spatial structure, deficiencies in the data can be identified, the possibility of their correction can be ascertained and correction can be effected (by use of local scaling) where possible. The procedures were motivated by data from replicates of Arabidopsis thaliana using the GeneChip ATH1-121501 microarray. Similar problems are illustrated for GeneChip Human Genome U133 arrays and for the newer and larger GeneChip Wheat Genome microarray.

AVAILABILITY: R code is freely available on request.
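The idea of using replicates and spatial position for local scaling can be illustrated with a minimal sketch. This is not the authors' R code or their exact procedure. It assumes each replicate is available as a 2D grid of positive intensities indexed by position on the chip, and it uses a block-wise median of log ratios as a stand-in spatial smoother. The function name `local_scale` and the parameters `target` and `block` are hypothetical.

```python
import numpy as np

def local_scale(replicates, target=0, block=8):
    """Sketch of spatially local scaling of one replicate array.

    replicates: array of shape (n_arrays, rows, cols), positive intensities
                laid out by physical position on the chip (assumed layout).
    target:     index of the replicate to examine and correct.
    block:      side length of the square blocks used as the local window.
    Returns (corrected intensities, local log2 bias map).
    """
    reps = np.asarray(replicates, dtype=float)
    log_reps = np.log2(reps)

    # Reference surface: per-feature median of the other replicates.
    others = np.delete(log_reps, target, axis=0)
    reference = np.median(others, axis=0)

    # Log ratio of the target to the reference at every chip position.
    diff = log_reps[target] - reference

    # Estimate the local bias as the median log ratio within each block.
    rows, cols = diff.shape
    local = np.zeros_like(diff)
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            local[r:r + block, c:c + block] = np.median(
                diff[r:r + block, c:c + block]
            )

    # Remove the local bias by rescaling the raw intensities.
    corrected = reps[target] / 2.0 ** local
    return corrected, local
```

In this sketch, blocks whose entries in the returned bias map are far from zero mark regions where the target array disagrees with its replicates. Whether such a region can be rescaled or has to be discarded would depend on the kind of defect, which the sketch does not attempt to judge.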
