A benchmark for Affymetrix GeneChip expression measures

MOTIVATION: The defining feature of oligonucleotide expression arrays is the use of several probes to assay each targeted transcript. This is a bonanza for the statistical geneticist, who can create probeset summaries with specific characteristics. There are now several methods available for summarizing probe level data from the popular Affymetrix GeneChips, but it is difficult to identify the best method for a given inquiry.

RESULTS: We have developed a graphical tool to evaluate summaries of Affymetrix probe level data. Plots and summary statistics offer a picture of how an expression measure performs in several important areas. This picture facilitates the comparison of competing expression measures and the selection of methods suitable for a specific investigation. The key is a benchmark data set consisting of a dilution study and a spike-in study. Because the truth is known for these data, we can identify statistical features of the data for which the expected outcome is known in advance. Those features highlighted in our suite of graphs are justified by questions of biological interest and motivated by the presence of appropriate data.
