Constructing a Boolean implication network to study the interactions between environmental factors and OTUs

Abstract

Mining relationships between microbes and the environment they live in is crucial to understanding the intrinsic mechanisms that govern the cycles of carbon, nitrogen and energy in a microbial community. Building on next-generation sequencing technology, the selective capture of 16S rRNA genes has enabled the study of co-occurrence patterns of microbial species from the viewpoint of complex networks, yielding successful descriptions of phenomena exhibited in microbial communities. However, because the effects of environmental factors such as temperature or soil conditions on microbes are complex, the analysis of co-occurrence networks alone cannot elucidate them. In this study, we apply a statistical method called Boolean implications for metagenomic studies (BIMS), which extracts Boolean implications (IF-THEN relationships), to capture the effects of environmental factors on microbial species from 16S rRNA sequencing data. We first demonstrate the power and effectiveness of BIMS through comprehensive simulation studies and then apply it to a real 16S rRNA sequencing dataset of marine microbes. From a total of 6,514 pairwise relationships identified at a false discovery rate (FDR) of 0.01, we construct a Boolean implication network between operational taxonomic units (OTUs) and environmental factors. Relationships in this network are supported by the literature and, most importantly, provide biological insight into the effects of environmental factors on microbes. Finally, we apply BIMS to detect three-way relationships and show that this strategy can help explain more complex relationships within a microbial community.
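To make the idea of an IF-THEN relationship concrete, the following is a minimal sketch of how an implication such as "IF temperature is high THEN the OTU is present" might be tested on paired Boolean samples. It is not the BIMS procedure itself: the sparse-quadrant statistic, the function names, and both thresholds (`min_stat`, `max_violation_rate`) are assumptions chosen for illustration, and the inputs are assumed to have been binarized already.

```python
from math import sqrt


def sparse_quadrant_stat(a_high, b_high):
    """Score the implication 'A high => B high' on paired Boolean samples.

    The implication is violated by samples where A is high and B is low.
    We compare the observed count in that quadrant with the count expected
    if A and B were independent. Returns (statistic, observed_violations).
    """
    n = len(a_high)
    if n == 0 or n != len(b_high):
        raise ValueError("inputs must be non-empty and of equal length")
    n_a_high = sum(a_high)
    n_b_low = n - sum(b_high)
    observed = sum(1 for a, b in zip(a_high, b_high) if a and not b)
    expected = n_a_high * n_b_low / n
    if expected == 0:
        # One variable is constant, so the data cannot support an implication.
        return 0.0, observed
    return (expected - observed) / sqrt(expected), observed


def implies(a_high, b_high, min_stat=3.0, max_violation_rate=0.1):
    """Return True if 'A high => B high' passes both illustrative thresholds.

    min_stat and max_violation_rate are hypothetical cut-offs, not values
    taken from the paper.
    """
    stat, observed = sparse_quadrant_stat(a_high, b_high)
    n_a_high = sum(a_high)
    violation_rate = observed / n_a_high if n_a_high else 0.0
    return stat >= min_stat and violation_rate <= max_violation_rate


# Toy data (made up): 40 samples, the OTU is present exactly when temperature is high.
temperature_high = [True] * 20 + [False] * 20
otu_present = [True] * 20 + [False] * 20
print(implies(temperature_high, otu_present))
```

In a real analysis, each environmental factor and OTU abundance would first need a binarization rule, and the resulting p-values or statistics would be corrected for multiple testing (the abstract reports an FDR of 0.01); neither step is shown here.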
