Inferring cluster-based networks from differently stimulated multiple time-course gene expression data

Motivation: Clustering and gene network inference often help to predict the biological functions of gene subsets. Recently, researchers have accumulated a large amount of time-course transcriptome data collected under different treatment conditions to understand the physiological states of cells in response to extracellular stimuli and to identify drug-responsive genes. Although a variety of statistical methods for clustering and inferring gene networks from expression profiles have been proposed, most of them are not designed to handle expression data collected under multiple stimulation conditions simultaneously.

Results: We propose a new statistical method for analyzing temporal profiles under multiple experimental conditions. Our method simultaneously performs clustering of temporal expression profiles and inference of regulatory relationships among gene clusters. We applied this method to MCF7 human breast cancer cells treated with epidermal growth factor and heregulin, which induce cellular proliferation and differentiation, respectively. The results showed that the method is useful for extracting biologically relevant information.

Availability: A MATLAB implementation of the method is available from http://csb.gsc.riken.jp/yshira/software/clusterNetwork.zip

Contact: yshira@riken.jp

Supplementary information: Supplementary data are available at Bioinformatics online.
