An Integrative Approach to Infer Regulation Programs in a Transcription Regulatory Module Network

The module network method, a special type of Bayesian network algorithm, has been proposed to infer transcription regulatory networks from gene expression data. In this method, a module represents a set of genes that have similar expression profiles and are regulated by the same transcription factors. Learning a module network consists of two steps: first clustering genes into modules, and then inferring the regulation program (the set of transcription factors) of each module. Many algorithms have been designed to infer the regulation program of a given gene module, and these algorithms show very different biases in detecting regulatory relationships. In this work, we explore the possibility of integrating results from different algorithms. The integration methods we select are union, intersection, and weighted rank aggregation. Experiments on a yeast dataset show that the union and weighted rank aggregation methods produce more accurate predictions than the individual algorithms, whereas the intersection method does not improve prediction accuracy. In addition, somewhat surprisingly, the union method, which has a lower computational cost than rank aggregation, achieves results comparable to those of rank aggregation.
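The three integration strategies can be illustrated with a minimal sketch. It assumes each algorithm returns a ranked list of candidate regulators for one module, best first. The algorithm names, regulator names, weights, and the cutoff `k` are all hypothetical placeholders. The aggregation step here is a simple weighted Borda count, used only as a stand-in; it is not necessarily the rank aggregation procedure used in this work.

```python
from collections import defaultdict


def union_top_k(rankings, k):
    """Regulators that appear in the top k of at least one algorithm."""
    return set().union(*(ranked[:k] for ranked in rankings.values()))


def intersection_top_k(rankings, k):
    """Regulators that appear in the top k of every algorithm."""
    tops = [set(ranked[:k]) for ranked in rankings.values()]
    return set.intersection(*tops)


def weighted_borda(rankings, weights):
    """Weighted Borda count (an illustrative stand-in for rank aggregation).

    A regulator at 0-based position p in a list of length n earns
    weight * (n - p) points; regulators absent from a list earn nothing.
    Ties are broken alphabetically so the output is deterministic.
    """
    scores = defaultdict(float)
    for algo, ranked in rankings.items():
        n = len(ranked)
        for pos, regulator in enumerate(ranked):
            scores[regulator] += weights[algo] * (n - pos)
    return sorted(scores, key=lambda r: (-scores[r], r))


# Hypothetical ranked regulator lists for one module (best first).
rankings = {
    "algo_a": ["TF1", "TF2", "TF3", "TF4"],
    "algo_b": ["TF2", "TF1", "TF5", "TF3"],
    "algo_c": ["TF3", "TF2", "TF6", "TF1"],
}
# Hypothetical per-algorithm weights (e.g. reflecting trust in each method).
weights = {"algo_a": 1.0, "algo_b": 1.0, "algo_c": 0.5}

union_result = union_top_k(rankings, k=2)
intersection_result = intersection_top_k(rankings, k=2)
aggregated = weighted_borda(rankings, weights)
```

In this toy input, the union keeps every regulator that any algorithm ranks highly, the intersection keeps only those on which all algorithms agree, and the aggregated list reorders all candidates by their combined weighted positions. The union and intersection need only set operations, which is consistent with the lower computational cost noted for the union method.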
