CMIP: a software package capable of reconstructing genome-wide regulatory networks using gene expression data

Background: A gene regulatory network (GRN) represents interactions of genes inside a cell or tissue, in which vertices and edges stand for genes and their regulatory interactions, respectively. Reconstruction of gene regulatory networks, in particular genome-scale networks, is essential for comparative exploration of different species and mechanistic investigation of biological processes. Currently, most network inference methods are computationally intensive; they are usually effective for small-scale tasks (e.g., networks with a few hundred genes) but struggle to construct GRNs at genome scale.

Results: Here, we present a software package for gene regulatory network reconstruction at a genomic level, in which gene interaction is measured by conditional mutual information within a parallel computing framework (hence the name CMIP). The package is a greatly improved implementation of our previous PCA-CMI algorithm. CMIP provides both an automatic threshold determination method and an effective parallel computing framework for network inference. Performance tests on benchmark datasets show that the accuracy of CMIP is comparable to that of most current network inference methods. Moreover, runtime tests on synthetic datasets demonstrate that CMIP can handle large datasets, including genome-wide datasets, within an acceptable time period. In addition, successful application to a real genomic dataset confirms the practical applicability of the package.

Conclusions: This new software package provides the biological community with a powerful tool for genomic network reconstruction. The software can be accessed at http://www.picb.ac.cn/CMIP/.
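To make the interaction measure concrete, the following is a minimal sketch of conditional mutual information (CMI) between two expression profiles. It assumes Gaussian-distributed data, for which CMI has a closed form in covariance determinants: CMI(X;Y|Z) = 0.5 * log(|C(X,Z)| |C(Y,Z)| / (|C(Z)| |C(X,Y,Z)|)). The function names (`covariance_matrix`, `determinant`, `gaussian_cmi`) and the simulated data are hypothetical illustrations; this is not the CMIP code and does not cover its threshold selection or parallel framework.

```python
import math
import random


def covariance_matrix(columns):
    """Sample covariance matrix of a list of equal-length value lists."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    return [[sum((a - ma) * (b - mb) for a, b in zip(ca, cb)) / (n - 1)
             for cb, mb in zip(columns, means)]
            for ca, ma in zip(columns, means)]


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = 1.0
    for i in range(size):
        pivot = max(range(i, size), key=lambda r: abs(m[r][i]))
        if abs(m[pivot][i]) < 1e-12:
            return 0.0  # numerically singular
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            det = -det  # a row swap flips the sign
        det *= m[i][i]
        for r in range(i + 1, size):
            factor = m[r][i] / m[i][i]
            for c in range(i, size):
                m[r][c] -= factor * m[i][c]
    return det


def gaussian_cmi(x, y, z=()):
    """CMI(X;Y|Z) under a Gaussian assumption (an assumption of this sketch).

    z is a sequence of conditioning profiles; if empty, this is plain MI.
    """
    z = list(z)

    def det_cov(cols):
        # The determinant of an empty covariance matrix is taken as 1.
        return determinant(covariance_matrix(cols)) if cols else 1.0

    numerator = det_cov([x] + z) * det_cov([y] + z)
    denominator = det_cov(z) * det_cov([x, y] + z)
    return 0.5 * math.log(numerator / denominator)


# Simulated regulatory chain g1 -> g2 -> g3 (hypothetical data).
rng = random.Random(0)
g1 = [rng.gauss(0, 1) for _ in range(2000)]
g2 = [v + rng.gauss(0, 0.5) for v in g1]
g3 = [v + rng.gauss(0, 0.5) for v in g2]

print("MI(g1; g3)       =", round(gaussian_cmi(g1, g3), 3))
print("CMI(g1; g3 | g2) =", round(gaussian_cmi(g1, g3, [g2]), 3))
```

In this simulated chain, g1 and g3 are strongly dependent marginally, but the dependence nearly vanishes once g2 is conditioned on. A CMI-based method uses this behaviour to remove indirect edges, which is why a threshold on the CMI value is needed to decide when an edge is dropped.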
