Adjustment method for microarray data generated using two-cycle RNA labeling protocol

Background
Microarray technology is widely used to monitor expression changes of thousands of genes simultaneously. However, labeling and hybridization require a relatively large amount of RNA, which makes it difficult to perform microarray experiments with limited biological material and has led to the development of many methods for preparing and amplifying mRNA. Amplification methods have been reported to introduce bias, which may strongly hamper the subsequent interpretation of the results. A major challenge is how to correct for this bias before further analysis.

Results
In this article, we observed the bias in rice gene expression microarray data generated with the Affymetrix one-cycle and two-cycle RNA labeling protocols, and validated it with real-time PCR. Based on these data, we proposed a statistical framework to model the process of two-cycle linear mRNA amplification, and established a linear model for probe-level correction. Maximum Likelihood Estimation (MLE) was applied to obtain a robust estimate of the Retaining Rate for each probe. After bias correction, known pre-processing methods, such as PDNN, can be applied to complete the preprocessing. We then evaluated our model, and the results suggest that it can effectively increase the quality of the microarray raw data: (i) it decreases the Coefficient of Variation of PM intensities within probe sets; (ii) it distinguishes the microarray samples of the five stages of rice stamen development more clearly; (iii) it improves the correlation coefficients among stamen microarray samples. We also discuss the necessity of model-based adjustment by comparing it with a simpler adjustment method.

Conclusion
We conclude that the adjustment model is necessary and can effectively improve the quality of gene expression estimates obtained from microarray raw data.
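As an illustration of evaluation criterion (i), the Coefficient of Variation (CV) of the PM intensities in a probe set can be computed as the sample standard deviation divided by the mean. The sketch below is not the authors' code: the function name and the intensity values are hypothetical and serve only to show how a lower CV after adjustment would be measured.

```python
from statistics import mean, stdev


def coefficient_of_variation(intensities):
    """Return the sample standard deviation divided by the mean."""
    m = mean(intensities)
    if m == 0:
        raise ValueError("mean intensity is zero; CV is undefined")
    return stdev(intensities) / m


# Hypothetical PM intensities for one probe set (illustrative values only,
# not taken from the paper), before and after a bias adjustment.
raw_pm = [120.0, 450.0, 80.0, 300.0]
adjusted_pm = [210.0, 260.0, 190.0, 240.0]

cv_raw = coefficient_of_variation(raw_pm)
cv_adjusted = coefficient_of_variation(adjusted_pm)

# A smaller CV means the probes in the set agree more closely.
print(f"CV before adjustment: {cv_raw:.3f}")
print(f"CV after adjustment:  {cv_adjusted:.3f}")
```

In practice the comparison would be made across all probe sets on the array rather than for a single one.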
