A Unified Mixed Effects Model for Gene Set Analysis of Time Course Microarray Experiments

Methods for gene set analysis test for coordinated changes of a group of genes involved in the same biological process or molecular pathway. Higher statistical power is gained for gene set analysis by combining weak signals from a number of individual genes in each group. Although many gene set analysis methods have been proposed for microarray experiments with two groups, few can be applied to time course experiments. We propose a unified statistical model for analyzing time course experiments at the gene set level using random coefficient models, which fall into the more general class of mixed effects models. These models include a systematic component that models the mean trajectory for the group of genes, and a random component (the random coefficients) that models how each gene's trajectory varies about the mean trajectory. We show that the proposed model (1) outperforms currently available methods at discriminating gene sets differentially changed over time from null gene sets; (2) provides more stable results that are less affected by sampling variations; (3) models dependency among genes adequately and preserves type I error rate; and (4) allows for gene ranking based on predicted values of the random effects. We describe simulation studies using gene expression data with "real life" correlations and we demonstrate the proposed random coefficient model using a mouse colon development time course dataset. The agreement between results of the proposed random coefficient model and the previous reports for this proof-of-concept trial further validates this methodology, which provides a unified statistical model for systems analysis of microarray experiments with complex experimental designs when re-sampling based methods are difficult to apply.

[1]  Purvesh Khatri,et al.  Onto-Tools, the toolkit of the modern biologist: Onto-Express, Onto-Compare, Onto-Design and Onto-Translate , 2003, Nucleic Acids Res..

[2]  Bing Zhang,et al.  GOTree Machine (GOTM): a web-based platform for interpreting sets of interesting genes using Gene Ontology hierarchies , 2004, BMC Bioinformatics.

[3]  Andrew B. Nobel,et al.  Significance analysis of functional categories in gene expression studies: a structured permutation approach , 2005, Bioinform..

[4]  Ulrich Mansmann,et al.  GlobalANCOVA: exploration and assessment of gene group effects , 2008, Bioinform..

[5]  Rafael A Irizarry,et al.  Exploration, normalization, and summaries of high density oligonucleotide array probe level data. , 2003, Biostatistics.

[6]  Pierre R. Bushel,et al.  Assessing Gene Significance from cDNA Microarray Expression Data via Mixed Models , 2001, J. Comput. Biol..

[7]  S. Wulff SAS for Mixed Models , 2007 .

[8]  Jelle J. Goeman,et al.  A global test for groups of genes: testing association with a clinical outcome , 2004, Bioinform..

[9]  M. Washington,et al.  Gene expression profile analysis of mouse colon embryonic development , 2005, Genesis.

[10]  Ian T. Jolliffe,et al.  Principal Component Analysis , 2002, International Encyclopedia of Statistical Science.

[11]  Jiajun Liu,et al.  Domain-enhanced analysis of microarray data using GO annotations , 2007, Bioinform..

[12]  Xi Chen,et al.  Supervised principal component analysis for gene set enrichment of microarray data with continuous or survival outcomes , 2008, Bioinform..

[13]  M. Ashburner,et al.  Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.

[14]  R W Doerge,et al.  Naive Application of Permutation Testing Leads to Inflated Type I Error Rates , 2008, Genetics.

[15]  Jean YH Yang,et al.  Bioconductor: open software development for computational biology and bioinformatics , 2004, Genome Biology.

[16]  Steven C. Lawlor,et al.  GenMAPP, a new tool for viewing and analyzing microarray data on biological pathways , 2002, Nature Genetics.

[17]  Bing Zhang,et al.  WebGestalt: an integrated system for exploring gene sets in various biological contexts , 2005, Nucleic Acids Res..

[18]  Taesung Park,et al.  Statistical tests for identifying differentially expressed genes in time-course microarray experiments , 2003, Bioinform..

[19]  M. Daly,et al.  PGC-1α-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes , 2003, Nature Genetics.

[20]  Seon-Young Kim,et al.  PAGE: Parametric Analysis of Gene Set Enrichment , 2005, BMC Bioinform..

[21]  John D. Storey,et al.  Significance analysis of time course microarray experiments. , 2005, Proceedings of the National Academy of Sciences of the United States of America.

[22]  R. O. Stuart,et al.  Changes in global gene expression patterns during development and maturation of the rat kidney , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[23]  B. Weir,et al.  A systematic statistical linear modeling approach to oligonucleotide array experiments. , 2002, Mathematical biosciences.

[24]  Bing Zhang,et al.  An Integrated Approach for the Analysis of Biological Pathways using Mixed Models , 2008, PLoS genetics.

[25]  Hongzhe Li,et al.  Model-based methods for identifying periodically expressed genes based on time course microarray gene expression data , 2004, Bioinform..