The impact of quantitative optimization of hybridization conditions on gene expression analysis

Background: With the growing availability of entire genome sequences, an increasing number of scientists can exploit oligonucleotide microarrays for genome-scale expression studies. While probe design is a major research area, relatively little work has been reported on the optimization of microarray protocols.

Results: As shown in this study, suboptimal conditions can have a considerable impact on biologically relevant observations. For example, a deviation from the optimal temperature by one degree Celsius led to a loss of up to 44% of the differentially expressed genes identified. While genes from thousands of Gene Ontology categories were affected, transcription factors and other low-copy-number regulators were disproportionately lost. Calibrated protocols are thus required in order to take full advantage of the large dynamic range of microarrays.

For an objective optimization of protocols, we introduce an approach that maximizes the amount of information obtained per experiment. A comparison of two typical samples is sufficient for this calibration. We can ensure, however, that optimization results are independent of the samples and of the specific measures used for calibration. Both simulations and spike-in experiments confirmed an unbiased determination of generally optimal experimental conditions.

Conclusions: Well-calibrated hybridization conditions are thus easily achieved and are necessary for the efficient detection of differential expression. They are essential for the sensitive profiling of low-copy-number molecules. This is particularly critical for studies of transcription factor expression, or for the inference and study of regulatory networks.
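The calibration idea can be illustrated with a minimal sketch. It is not the authors' method: the abstract says the approach maximizes the amount of information obtained per experiment, whereas this sketch swaps in a cruder proxy (the number of differentially expressed genes detected at each candidate temperature) as an assumption. The function name, the temperatures, and the counts are all hypothetical and chosen only to show the arithmetic behind a relative loss such as the 44% quoted above.

```python
def pick_optimal_condition(detected_by_condition):
    """Return the condition with the most detected genes, plus the
    fractional loss of detected genes at every condition relative to it.

    `detected_by_condition` maps a condition (here: hybridization
    temperature in degrees Celsius) to a count of detected genes.
    The count is an assumed stand-in for an information measure.
    """
    if not detected_by_condition:
        raise ValueError("no conditions supplied")
    best = max(detected_by_condition, key=detected_by_condition.get)
    best_count = detected_by_condition[best]
    if best_count <= 0:
        raise ValueError("no genes detected under any condition")
    # Fraction of the optimum's detections that is lost at each condition.
    loss = {c: 1 - n / best_count for c, n in detected_by_condition.items()}
    return best, loss


# Hypothetical counts, invented for illustration only (not data from the study).
toy_counts = {41.0: 560, 42.0: 1000, 43.0: 700}
best_temp, relative_loss = pick_optimal_condition(toy_counts)
for temp in sorted(relative_loss):
    print(f"{temp:.1f} C: {relative_loss[temp]:.0%} of detections lost")
```

With these invented counts, the condition one degree below the best one loses 44% of the detections, which is the kind of comparison a calibration run across candidate conditions would make explicit.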
