NARROMI: a noise and redundancy reduction technique improves accuracy of gene regulatory network inference

MOTIVATION: Reconstruction of gene regulatory networks (GRNs) is of utmost interest to biologists and is vital for understanding the complex regulatory mechanisms within the cell. Although various methods have been developed for reconstructing GRNs from gene expression profiles, they are notorious for high false-positive rates owing to the noise inherent in the data, especially for datasets with a large number of genes but a small number of samples.

RESULTS: In this work, we present a novel method, NARROMI, that improves the accuracy of GRN inference by combining ordinary differential equation-based recursive optimization (RO) with information theory-based mutual information (MI). In the proposed algorithm, noisy regulations with low pairwise correlations are first removed using MI, and redundant regulations from indirect regulators are then excluded by RO. In particular, the RO step helps determine regulatory directions without prior knowledge of regulators. Results on benchmark datasets from the Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge and on the experimentally determined GRN of Escherichia coli show that NARROMI significantly outperforms other popular methods in terms of false-positive rate and accuracy.

AVAILABILITY: All the source data and code are available at: http://csb.shu.edu.cn/narromi.htm.
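The two-stage idea described in the abstract (an MI filter followed by recursive removal of redundant regulators) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian MI estimator, the least-squares refit used as a stand-in for the RO step, the function names, and both thresholds are assumptions chosen only for illustration.

```python
import numpy as np

def gaussian_mi(x, y):
    """Mutual information of two variables, ASSUMING they are jointly Gaussian."""
    rho = np.corrcoef(x, y)[0, 1]
    rho2 = min(rho * rho, 1.0 - 1e-12)  # guard against log(0)
    return -0.5 * np.log(1.0 - rho2)

def infer_regulators(X, target, mi_threshold=0.05, coef_threshold=0.1):
    """Hypothetical helper: X is (samples x genes); returns {regulator: weight}."""
    X = X - X.mean(axis=0)  # center so no intercept term is needed
    y = X[:, target]
    # Stage 1: drop candidate regulators whose MI with the target is low.
    candidates = [j for j in range(X.shape[1])
                  if j != target and gaussian_mi(X[:, j], y) >= mi_threshold]
    # Stage 2 (stand-in for RO): refit a linear model and prune weak
    # coefficients repeatedly until the set of regulators stops shrinking.
    while candidates:
        coefs, *_ = np.linalg.lstsq(X[:, candidates], y, rcond=None)
        keep = [j for j, c in zip(candidates, coefs) if abs(c) >= coef_threshold]
        if len(keep) == len(candidates):
            return {j: float(c) for j, c in zip(candidates, coefs)}
        candidates = keep
    return {}

# Synthetic chain g0 -> g1 -> g2, plus an unrelated gene g3.
rng = np.random.default_rng(0)
n = 500
g0 = rng.standard_normal(n)
g1 = 0.8 * g0 + 0.6 * rng.standard_normal(n)
g2 = 0.9 * g1 + 0.3 * rng.standard_normal(n)
g3 = rng.standard_normal(n)
data = np.column_stack([g0, g1, g2, g3])

regs = infer_regulators(data, target=2)
print(regs)
```

In this toy example g3 is expected to fail the MI filter, while g0 passes it (it is correlated with g2 through g1) and is only removed in the second stage, which is the indirect-regulator case the abstract refers to.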
