Harnessing Diversity towards the Reconstructing of Large Scale Gene Regulatory Networks

Elucidating gene regulatory networks (GRNs) from large-scale experimental data remains a central challenge in systems biology. Among the many techniques proposed recently, consensus-driven approaches that combine different algorithms have emerged as a promising strategy for inferring accurate GRNs. Here, we develop a novel consensus inference algorithm, TopkNet, that integrates multiple algorithms to infer GRNs. Comprehensive performance benchmarking on a cloud computing framework demonstrated that (i) simply combining many algorithms does not always yield a performance improvement that justifies the cost of consensus, and (ii) TopkNet integrating only high-performance algorithms provides a significant performance improvement over both the best individual algorithms and community prediction. These results suggest that a priori determination of high-performance algorithms is key to reconstructing an unknown regulatory network. Similarity among gene-expression datasets can help determine which algorithms are likely to be optimal for reconstructing unknown regulatory networks: if the expression data associated with a known regulatory network are similar to those associated with an unknown regulatory network, the optimal algorithms determined for the known network can be repurposed to infer the unknown one. Based on this observation, we developed a quantitative measure of similarity among gene-expression datasets and demonstrated that, if the similarity between two expression datasets is high, TopkNet integrating the algorithms that are optimal for the known dataset performs well on the unknown dataset. The consensus framework, TopkNet, together with the similarity measure proposed in this study, provides a powerful strategy for harnessing the wisdom of the crowds in the reconstruction of unknown regulatory networks.
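
The abstract does not specify how TopkNet combines predictions, so the following is only a minimal sketch of rank-based consensus in general, not the authors' method. It assumes each algorithm outputs a confidence score per candidate edge. The function names, the example data, and the "k-th best rank" rule are all hypothetical illustrations. With `k=0` the sketch averages ranks across algorithms (one common form of community prediction); with `k>=1` it orders edges by their k-th best rank.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (regulator, target); naming is an assumption


def rank_edges(scores: Dict[Edge, float]) -> Dict[Edge, int]:
    """Rank edges by descending confidence; rank 1 is the most confident."""
    ordered = sorted(scores, key=lambda e: (-scores[e], e))
    return {edge: i + 1 for i, edge in enumerate(ordered)}


def consensus(predictions: Dict[str, Dict[Edge, float]], k: int = 0) -> List[Edge]:
    """Order edges by a consensus rank across algorithms.

    k == 0: average rank across algorithms.
    k >= 1: k-th best rank across algorithms (hypothetical rule,
            not taken from the paper).
    """
    ranks = [rank_edges(s) for s in predictions.values()]
    edges = sorted(set().union(*[r.keys() for r in ranks]))
    worst = len(edges) + 1  # rank used when an algorithm did not score an edge

    def score(edge: Edge) -> float:
        per_algo = sorted(r.get(edge, worst) for r in ranks)
        if k == 0:
            return sum(per_algo) / len(per_algo)
        return per_algo[min(k, len(per_algo)) - 1]

    # Ties are broken alphabetically by edge so the output is deterministic.
    return sorted(edges, key=lambda e: (score(e), e))


# Made-up confidence scores from three hypothetical algorithms.
predictions = {
    "algo_a": {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5},
    "algo_b": {("g1", "g2"): 0.1, ("g1", "g3"): 0.8, ("g2", "g3"): 0.7},
    "algo_c": {("g1", "g2"): 0.6, ("g1", "g3"): 0.3, ("g2", "g3"): 0.9},
}

average_rank_order = consensus(predictions)        # average-rank consensus
second_best_order = consensus(predictions, k=2)    # 2nd-best-rank consensus
```

Restricting `predictions` to a subset of algorithms before calling `consensus` mirrors the abstract's point that integrating only high-performance algorithms can matter more than integrating many.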
