BINDER: computationally inferring a gene regulatory network for Mycobacterium abscessus

Background
Although many of the genic features in Mycobacterium abscessus have been fully validated, a comprehensive understanding of its regulatory elements remains lacking. Moreover, little is known about how the organism regulates its transcriptomic profile to enable cells to survive in hostile environments. Here, to computationally infer the gene regulatory network for Mycobacterium abscessus, we propose a novel statistical computational modelling approach: BayesIan gene regulatory Networks inferreD via gene coExpression and compaRative genomics (BINDER). In tandem with derived experimental coexpression data, the property of genomic conservation is exploited to probabilistically infer a gene regulatory network in Mycobacterium abscessus.

Inference on regulatory interactions is conducted by combining 'primary' and 'auxiliary' data strata. The data forming the primary and auxiliary strata are derived from RNA-seq experiments and sequence information in the primary organism, Mycobacterium abscessus, as well as ChIP-seq data extracted from a related proxy organism, Mycobacterium tuberculosis. The primary and auxiliary data are combined in a hierarchical Bayesian framework, informing the apposite bivariate likelihood function and the prior distributions, respectively. The inferred relationships provide insight into regulon groupings in Mycobacterium abscessus.

Results
We implement BINDER on data relating to a collection of 167,280 regulator-target pairs, resulting in the identification of 54 regulator-target pairs, across 5 transcription factors, for which there is a strong probability of regulatory interaction.

Conclusions
The inferred regulatory interactions provide insight into, and a valuable resource for further studies of, transcriptional control in Mycobacterium abscessus, and in the family Mycobacteriaceae more generally. Further, the BINDER framework is broadly applicable: it can be used wherever computational inference of a gene regulatory network requires integrating data sources derived from both the primary organism of interest and related proxy organisms.
