Data-driven normalization strategies for high-throughput quantitative RT-PCR

Background: High-throughput real-time quantitative reverse transcriptase polymerase chain reaction (qPCR) is widely used in experiments that profile gene expression patterns. Current technology allows profiles to be acquired for a moderate number of genes (50 to a few thousand), and this number continues to grow. Choosing an appropriate normalization algorithm for qPCR-based data is therefore an important part of the data preprocessing pipeline.

Results: We present and evaluate two data-driven normalization methods that directly correct for technical variation and represent robust alternatives to standard housekeeping gene-based approaches. We evaluated the performance of these methods against a single housekeeping gene method, and our results suggest that quantile normalization performs best. These methods are implemented in freely available software as an R package, qpcrNorm, distributed through the Bioconductor project.

Conclusion: The utility of the approaches that we describe is demonstrated most clearly in situations where standard housekeeping genes are regulated by some experimental condition. For large qPCR-based data sets, our approaches represent robust, data-driven strategies for normalization.
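The abstract names quantile normalization as the best-performing method but does not describe its mechanics. As a rough illustration only, the sketch below shows the generic textbook form of quantile normalization applied to per-sample lists of Ct values. It is not the qpcrNorm implementation; the function name `quantile_normalize`, the input layout (one list per sample, genes in the same order), and the example values are all assumptions made for this example.

```python
def quantile_normalize(samples):
    """Generic quantile normalization (illustrative sketch, not qpcrNorm).

    samples: list of equal-length lists, one list of Ct values per sample,
             with genes in the same order in every list (assumed layout).
    Returns new lists in which every sample shares the same distribution.
    """
    n_genes = len(samples[0])
    if any(len(s) != n_genes for s in samples):
        raise ValueError("all samples must contain the same number of genes")

    # Reference distribution: mean of the k-th smallest value across samples.
    sorted_samples = [sorted(s) for s in samples]
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples)
        for k in range(n_genes)
    ]

    normalized = []
    for s in samples:
        # Gene indices ordered by value; ties keep their original order.
        order = sorted(range(n_genes), key=lambda i: s[i])
        out = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            # Replace each value with the reference value of the same rank.
            out[gene_index] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical Ct values for three genes in two samples.
ct_values = [[20.0, 25.0, 30.0], [24.0, 22.0, 35.0]]
print(quantile_normalize(ct_values))
# [[21.0, 24.5, 32.5], [24.5, 21.0, 32.5]]
```

After normalization each sample contains the same set of values, and only the ranking of genes within a sample differs. This is what makes the approach independent of any single housekeeping gene.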
