timeClip: pathway analysis for time course data without replicates

Background: Time-course gene expression experiments are useful tools for exploring biological processes. In this type of experiment, gene expression changes are monitored over time. Unfortunately, replication of time series is still costly, and long time courses usually lack replicates. Many approaches have been proposed to deal with this data structure, but none of them in the field of pathway analysis. Pathway analyses have become highly relevant for the interpretation of gene expression data. Several methods have been proposed for this purpose, from classical enrichment analysis to more complex topological analysis that gains power from the topology of the pathway. None of them was devised to identify temporal variations in time-course data.

Results: Here we present timeClip, a topology-based pathway analysis specifically tailored to long time series without replicates. timeClip combines dimension reduction techniques and graph decomposition theory to explore and identify the portions of pathways that are most time-dependent. In the first step, timeClip selects the time-dependent pathways; in the second step, the most time-dependent portions of these pathways are highlighted. We applied timeClip to simulated data and to a benchmark dataset from a mouse model of muscle regeneration. Our approach shows good performance in different simulated settings. On the real dataset, we identify 76 time-dependent pathways, most of which are known to be involved in the regeneration process. Focusing on the 'mTOR signaling pathway', we highlight the timing of key processes of muscle regeneration: from the early pathway activation through growth factor signals to the late burst of protein production needed for fiber regeneration.

Conclusions: timeClip represents an advance in the field of time-dependent pathway analysis. It allows the isolation and dissection of pathways characterized by time-dependent components. Furthermore, using timeClip on a mouse muscle regeneration dataset, we were able to characterize the process of muscle fiber regeneration with its correct timing.
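
The first step described above (reduce a pathway to a low-dimensional summary, then ask whether that summary depends on time) can be illustrated with a minimal Python sketch. This is not the timeClip implementation: the abstract does not specify the dimension reduction technique or the test, so the use of the first principal component, a quadratic polynomial fit, a permutation p-value, and all function names below are assumptions made only for illustration. The second step (graph decomposition of the pathway) is not covered.

```python
import numpy as np

# Illustrative sketch only; all names and modelling choices are assumptions,
# not the timeClip method itself.

def first_pc_scores(expr):
    """Scores of the first principal component.

    expr: array of shape (n_timepoints, n_genes), one row per time point
    (no replicates), one column per gene of the pathway.
    """
    centered = expr - expr.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, 0] * s[0]

def poly_r2(times, y, degree=2):
    """R^2 of a polynomial fit of y against time."""
    coeffs = np.polyfit(times, y, degree)
    fitted = np.polyval(coeffs, times)
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def time_dependence_pvalue(expr, times, degree=2, n_perm=500, seed=0):
    """Permutation p-value for time dependence of the first PC.

    Shuffling the PC scores across time points breaks any temporal
    structure, giving a null distribution for R^2.
    """
    rng = np.random.default_rng(seed)
    scores = first_pc_scores(expr)
    observed = poly_r2(times, scores, degree)
    null = np.array([
        poly_r2(times, rng.permutation(scores), degree)
        for _ in range(n_perm)
    ])
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, p_value

# Toy pathway: 8 genes sharing a linear trend over 12 time points.
rng = np.random.default_rng(1)
times = np.arange(12, dtype=float)
loadings = np.linspace(0.5, 1.5, 8)
expr = np.outer(times, loadings) + rng.normal(scale=0.05, size=(12, 8))
r2, p_value = time_dependence_pvalue(expr, times)
```

On the toy data, the shared trend dominates the first principal component, so the fit explains most of its variance and the permutation p-value is small. A pathway made of pure noise would be expected to give a much larger p-value.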
