Microarray experiments and factors which affect their reliability

Abstract

Oligonucleotide microarrays are among the basic tools of molecular biology and allow simultaneous assessment of the expression levels of thousands of genes. Analysis of microarray data is, however, very complex and requires sophisticated methods to control for various factors inherent to the procedures used. In this article we describe the individual steps of a microarray experiment, highlighting elements and factors that may affect the processes involved and influence the interpretation of the results. We also describe methods that can be used to estimate the influence of these factors and to control how they affect the expression estimates. A comprehensive understanding of the experimental protocol used in a microarray experiment aids the interpretation of the results obtained. By describing known factors that affect expression estimates, this article provides guidelines for appropriate quality control and pre-processing of the data, which are also applicable to other transcriptome analysis methods that use similar sample handling protocols.

Reviewers: This article was reviewed by Dr. Janet Siefert, Dr. Leonid Hanin, and Dr. I King Jordan.
