BGX: a fully Bayesian integrated approach to the analysis of Affymetrix GeneChip data.

We present Bayesian hierarchical models for the analysis of Affymetrix GeneChip data. The approach we take differs from other available approaches in two fundamental aspects. Firstly, we aim to integrate all processing steps of the raw data in a common, statistically coherent framework, allowing all components, and thus their associated errors, to be considered simultaneously. Secondly, inference is based on the full posterior distribution of gene expression indices and derived quantities, such as fold changes or ranks, rather than on single point estimates. Measures of uncertainty on these quantities are thus available. The models presented represent the first building block for an integrated Bayesian analysis of Affymetrix GeneChip data. They take into account additive as well as multiplicative error; gene expression levels are estimated using perfect match probes and a fraction of mismatch probes, and are modeled on the log scale. Background correction is incorporated by modeling true signal and cross-hybridization explicitly, and the need for further normalization is considerably reduced by allowing for array-specific distributions of nonspecific hybridization. When replicate arrays are available for a condition, posterior distributions of condition-specific gene expression indices are estimated directly by considering the replicate probe sets simultaneously, which avoids averaging over estimates obtained from individual replicate arrays. The performance of the Bayesian model is compared to that of standard point-estimate methods on subsets of the well-known GeneLogic and Affymetrix spike-in data. The Bayesian model is found to perform well, and the integrated procedure presented appears to hold considerable promise for further development.
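To illustrate what inference on the full posterior distribution can look like in practice, the following is a minimal sketch rather than the BGX implementation. It assumes that posterior draws of log-scale expression indices are already available for two conditions as arrays of shape (draws, genes). The function name `summarize_contrast`, the simulated draws, and all numerical values are hypothetical and serve only as an example.

```python
import numpy as np


def summarize_contrast(draws_a, draws_b, level=0.95):
    """Summarize posterior draws of log-scale expression for two conditions.

    draws_a, draws_b: arrays of shape (n_draws, n_genes), assumed to be
    posterior samples of gene expression indices on the log scale.
    """
    # Log fold change, computed draw by draw so uncertainty is propagated.
    diff = np.asarray(draws_a) - np.asarray(draws_b)

    # Equal-tailed credible interval for each gene.
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(diff, [tail, 1.0 - tail], axis=0)

    # Rank genes within each draw (rank 1 = largest log fold change),
    # which yields a posterior distribution over ranks.
    ranks = np.argsort(np.argsort(-diff, axis=1), axis=1) + 1

    return {
        "mean": diff.mean(axis=0),
        "lower": lower,
        "upper": upper,
        "prob_up": (diff > 0).mean(axis=0),
        "median_rank": np.median(ranks, axis=0),
    }


# Simulated stand-in for MCMC output (hypothetical values, illustration only).
rng = np.random.default_rng(0)
n_draws, n_genes = 2000, 5
true_a = np.array([8.0, 6.0, 7.0, 5.0, 9.0])
true_b = np.array([6.0, 6.0, 7.0, 5.0, 9.0])
draws_a = true_a + rng.normal(0.0, 0.3, size=(n_draws, n_genes))
draws_b = true_b + rng.normal(0.0, 0.3, size=(n_draws, n_genes))

summary = summarize_contrast(draws_a, draws_b)
```

Because the fold changes and ranks are computed within each draw before summarizing, the resulting intervals and rank distributions reflect the joint uncertainty in both conditions, which a comparison of two point estimates would not capture.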
