MinReg: A Scalable Algorithm for Learning Parsimonious Regulatory Networks in Yeast and Mammals

In recent years, there has been growing interest in applying Bayesian networks and their extensions to reconstruct regulatory networks from gene expression data. Because the gene expression domain involves a large number of variables and a limited number of samples, it poses both computational and statistical challenges to Bayesian network learning algorithms. Here we define a constrained family of Bayesian network structures suitable for this domain and devise an efficient search algorithm that uses these structural constraints to find high-scoring networks from data. Interestingly, under reasonable assumptions on the underlying probability distribution, we can provide performance guarantees for our algorithm. Evaluation on real data from yeast and mouse demonstrates that our method not only reconstructs a high-quality model of the yeast regulatory network, but is also the first method to scale to the complexity of mammalian networks, successfully reconstructing a reasonable model over thousands of variables.
