Integration of heterogeneous molecular networks to unravel gene-regulation in Mycobacterium tuberculosis

Background: Different methods have been developed to infer regulatory networks from heterogeneous omics datasets and to construct co-expression networks. Each algorithm produces different networks, and efforts have been devoted to automatically integrating them into consensus sets. However, each separate set has an intrinsic value that is diluted and partly lost when building a consensus network. Here we present a methodology to generate co-expression networks and, instead of a consensus network, we propose an integration framework in which the different networks are kept and analysed with additional tools to efficiently combine the information extracted from each network.

Results: We developed a workflow to efficiently analyse information generated by different inference and prediction methods. Our methodology relies on giving the user the means to simultaneously visualise and analyse the coexisting networks generated by different algorithms and heterogeneous datasets, together with a suite of analysis tools. As a showcase, we analysed the gene co-expression networks of Mycobacterium tuberculosis generated using over 600 expression experiments. Regarding DNA damage repair, we identified SigC as a key control element, 12 new targets for LexA, an updated LexA binding motif, and a potential mismatch repair system. We expanded the DevR regulon with 27 genes while identifying 9 targets wrongly assigned to this regulon. We discovered 10 new genes linked to zinc uptake and a new regulatory mechanism for ZuR. The use of co-expression networks to perform system-level analysis allows the development of custom-made methodologies. As showcases, we implemented a pipeline to integrate ChIP-seq data and another method to uncover multiple regulatory layers.

Conclusions: Our workflow is based on representing the multiple types of information as networks and presenting these networks in a synchronous framework that allows their simultaneous visualization while keeping the specific associations from the different networks. By simultaneously exploring these networks and metadata, we gained insights into regulatory mechanisms in M. tuberculosis that could not be obtained through the separate analysis of each data type.
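
The idea of keeping the networks side by side rather than merging them into a consensus can be illustrated with a minimal sketch. It is not the workflow described above: the gene names, expression values, the 0.9 correlation threshold, the `chip_seq` edge, and the helper functions `pearson`, `coexpression_edges`, and `edge_support` are all hypothetical, and plain Pearson correlation stands in for whichever inference methods are actually used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expr, threshold):
    """Return gene pairs whose absolute correlation meets the threshold."""
    edges = set()
    for a, b in combinations(sorted(expr), 2):
        if abs(pearson(expr[a], expr[b])) >= threshold:
            edges.add((a, b))
    return edges


def edge_support(networks):
    """Map each edge to the names of the networks that contain it."""
    support = {}
    for name, edges in networks.items():
        for edge in edges:
            support.setdefault(edge, set()).add(name)
    return support


# Hypothetical toy expression profiles (one value per experiment).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 1.0, 3.5, 2.0],
}

# Each evidence type stays a separate, named network.
networks = {
    "coexpression": coexpression_edges(expr, threshold=0.9),
    "chip_seq": {("geneA", "geneC")},  # hypothetical binding evidence
}

support = edge_support(networks)
for edge, sources in sorted(support.items()):
    print(edge, sorted(sources))
```

Because every edge carries the names of the networks that support it, an association found by only one method remains visible and queryable instead of being averaged away.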
