Biomarker detection in the integration of multiple multi-class genomic studies

MOTIVATION: Systematic integration of information from multiple related microarray studies has become an important issue as the technology has matured and become prevalent over the past decade. The aggregated information provides more robust and accurate biomarker detection. So far, published meta-analysis methods for this purpose have mostly considered two-class comparisons. Methods that combine multi-class studies and take expression pattern concordance into account have rarely been explored.

RESULTS: In this article, we develop three integration methods for biomarker detection in multiple multi-class microarray studies: ANOVA-maxP, min-MCC and OW-min-MCC. We first consider a natural extension of combining P-values from the traditional ANOVA model. Because P-values from ANOVA are not guaranteed to reflect concordant expression patterns across studies, we propose a multi-class correlation (MCC) measure that specifically seeks biomarkers with concordant inter-class patterns across a pair of studies. For both the ANOVA and MCC approaches, we use extreme order statistics to identify biomarkers that are differentially expressed (DE) in all studies (i.e. ANOVA-maxP and min-MCC). The min-MCC method is further extended to identify biomarkers that are DE in only a subset of the studies by incorporating a recently developed optimally weighted (OW) technique (OW-min-MCC). All methods are evaluated by simulation studies and by three meta-analysis applications: multi-tissue mouse metabolism datasets, multi-condition mouse trauma datasets and multi-malignant-condition human prostate cancer datasets. The results show the complementary strengths of the three methods for different biological purposes.

AVAILABILITY: http://www.biostat.pitt.edu/bioinfo/.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
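The extreme-order-statistic idea behind ANOVA-maxP can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each study has already produced a per-gene P-value (for example from a one-way ANOVA F-test computed elsewhere), that the studies are independent, and that the null P-values are uniform, so that the maximum of K P-values follows a Beta(K, 1) distribution with CDF x^K. The function name `max_p_combine` and the example P-values are hypothetical; the paper may assess significance differently.

```python
def max_p_combine(p_values):
    """Combine per-study P-values for one gene with the maximum-P statistic.

    Assumption (not taken from the abstract): studies are independent and
    null P-values are uniform on [0, 1], so P(max <= x) = x ** K.
    Returns (max_p, combined_p).
    """
    if not p_values:
        raise ValueError("need at least one per-study P-value")
    for p in p_values:
        if not 0.0 <= p <= 1.0:
            raise ValueError("P-values must lie in [0, 1]")
    k = len(p_values)          # number of studies
    max_p = max(p_values)      # the least significant study drives the result
    return max_p, max_p ** k   # Beta(K, 1) CDF evaluated at max_p


# Hypothetical per-study ANOVA P-values for two genes across three studies.
example = {
    "gene_a": [0.01, 0.02, 0.03],   # small in every study
    "gene_b": [1e-6, 1e-5, 0.4],    # very small in two studies, not in the third
}

for gene, pvals in example.items():
    stat, combined = max_p_combine(pvals)
    print(gene, stat, combined)
```

Because the statistic is the largest P-value, a gene must be significant in every study to score well, which matches the stated goal of detecting biomarkers DE in all studies. In this toy input, `gene_b` is penalized by its single non-significant study even though its other two P-values are far smaller than those of `gene_a`.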
