PlasFlow: predicting plasmid sequences in metagenomic data using genome signatures

Abstract

Plasmids are mobile genetic elements that play an important role in the environmental adaptation of microorganisms. Although plasmids are usually analyzed in cultured microorganisms, there is a need for methods that allow the analysis of pools of plasmids (plasmidomes) in environmental samples. To that end, several molecular biology and bioinformatics methods have been developed; however, they are limited to low-diversity environments and cannot recover large plasmids. Here, we present PlasFlow, a novel tool based on genomic signatures that employs a neural network approach to identify bacterial plasmid sequences in environmental samples. PlasFlow can recover plasmid sequences from assembled metagenomes, without any prior knowledge of the taxonomical or functional composition of the samples, with an accuracy of up to 96%. It can also recover sequences of both circular and linear plasmids and can perform initial taxonomical classification of sequences. Compared to other currently available tools, PlasFlow demonstrated significantly better performance on test datasets. Analysis of two samples from heavy metal-contaminated microbial mats revealed that plasmids may constitute an important fraction of their metagenomes and carry genes involved in heavy metal homeostasis, proving the pivotal role of plasmids in microorganism adaptation to environmental conditions.
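To make the notion of a genomic signature concrete, the following is a minimal sketch that assumes a signature can be represented as the normalized frequencies of all k-mers in a sequence, a common representation in the literature. The function name `kmer_signature`, the choice of `k`, and the example sequence are illustrative assumptions and are not taken from PlasFlow itself; a vector of this kind is what a classifier such as a neural network could take as input.

```python
from collections import Counter
from itertools import product


def kmer_signature(sequence, k=3):
    """Return normalized k-mer frequencies for a DNA sequence.

    Hypothetical helper for illustration only; not PlasFlow's API.
    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = sequence.upper()
    # Enumerate all 4**k possible k-mers so every signature has the same keys.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only windows made of unambiguous bases contribute to the total.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return {m: 0.0 for m in kmers}
    return {m: counts[m] / total for m in kmers}


# Example: dinucleotide signature of a short, made-up contig.
signature = kmer_signature("ATGCGATACGCTTGAGC", k=2)
```

Because every sequence maps to a vector of the same length (4^k entries), contigs of different lengths can be compared or classified with the same model.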
