Weighted analysis of general microarray experiments

BMC Bioinformatics (BioMed Central), Methodology article

Background: In DNA microarray experiments, measurements from different biological samples are often assumed to be independent and to have identical variance. For many datasets these assumptions have been shown to be invalid, and they typically lead to overly optimistic p-values. A method called WAME has been proposed in which a variance is estimated for each sample and a covariance is estimated for each pair of samples. The current version of WAME is, however, limited to experiments with a paired design, e.g. two-channel microarrays.

Results: The WAME procedure is extended to general microarray experiments, making it capable of handling both one- and two-channel datasets. Two public one-channel datasets are analysed, and WAME detects both unequal variances and correlations. WAME is compared to other common methods: fold-change ranking, the ordinary linear model with t-tests, LIMMA and weighted LIMMA. The p-value distributions are shown to differ greatly between the examined methods. In a resampling-based simulation study, the p-values generated by WAME are found to be substantially more accurate than those of the alternatives when a relatively small proportion of the genes is regulated. WAME is also shown to have higher power than the other methods. WAME is available as an R package.

Conclusion: The WAME procedure is generalized, and the limitation to paired-design microarray datasets is removed. The other examined methods produce invalid p-values in many cases, while WAME is shown to produce essentially valid p-values when a relatively small proportion of genes is regulated. WAME is also shown to have higher power than the examined alternative methods.
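To illustrate why per-sample variances and between-sample covariances matter for the estimates, the following is a minimal sketch of generalized least squares for a single gene. It is not the WAME implementation: the covariance matrix is assumed to be known rather than estimated from the data, and the names `solve`, `gls`, `design`, `sigma` and `x` are hypothetical, chosen only for this example.

```python
def solve(a, b):
    """Solve a @ y = b by Gauss-Jordan elimination with partial pivoting.

    a is an n x n list of lists, b is an n x m list of lists.
    Returns y as an n x m list of lists.
    """
    n = len(a)
    # Augmented matrix [a | b], stored as floats.
    m = [list(map(float, ra)) + list(map(float, rb)) for ra, rb in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest absolute entry in this column.
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def gls(design, sigma, x):
    """Generalized least squares: (D' S^-1 D)^-1 D' S^-1 x.

    design: n x p design matrix D (one row per sample)
    sigma:  n x n covariance matrix S between samples (assumed known here)
    x:      length-n list of measurements for one gene
    """
    n, p = len(design), len(design[0])
    si_d = solve(sigma, design)             # S^-1 D, n x p
    si_x = solve(sigma, [[v] for v in x])   # S^-1 x, n x 1
    dtsd = [[sum(design[i][j] * si_d[i][k] for i in range(n))
             for k in range(p)] for j in range(p)]
    dtsx = [[sum(design[i][j] * si_x[i][0] for i in range(n))]
            for j in range(p)]
    return [row[0] for row in solve(dtsd, dtsx)]


# Hypothetical toy data: four one-channel arrays, two per group.
design = [[1, 0], [1, 0], [0, 1], [0, 1]]
x = [1.0, 3.0, 5.0, 5.0]

# Equal variances, no correlation: reduces to ordinary group means.
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
ordinary = gls(design, identity, x)

# Second array assumed three times as variable as the others.
sigma = [[1, 0, 0, 0], [0, 3, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
weighted = gls(design, sigma, x)
```

With the identity covariance, the first group estimate is the plain mean of 1.0 and 3.0. When the second array is given three times the variance, it is down-weighted and the estimate moves towards the first array's value, to about 1.5. Off-diagonal entries in `sigma` would model correlated samples in the same way.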
