Where did you come from, where did you go: Refining metagenomic analysis tools for horizontal gene transfer characterisation

Horizontal gene transfer (HGT) has changed the way we regard evolution. Instead of waiting for the next generation to establish new traits, organisms, bacteria in particular, can take a shortcut via HGT, which lets them pass genes from one individual to another, even across species boundaries. The tool Daisy offers the first HGT detection approach based on read mapping and provides evidence complementary to that of existing methods. However, Daisy requires that the acceptor and donor organisms involved in the HGT are known. We introduce DaisyGPS, a mapping-based pipeline that identifies acceptor and donor reference candidates of an HGT event from sequencing reads. Acceptor and donor identification is akin to species identification in metagenomic samples based on sequencing reads, a problem addressed by metagenomic profiling tools. However, acceptor and donor references have certain properties that prevent these methods from being applied directly. DaisyGPS uses MicrobeGPS, a metagenomic profiling tool tailored towards estimating the genomic distance between organisms in the sample and the reference database. We enhance the underlying scoring system of MicrobeGPS to account for the mapping coverage patterns of an acceptor and a donor involved in an HGT event, and report a ranked list of reference candidates. These candidates can then be further evaluated by tools like Daisy to establish HGT regions. We successfully validated our approach on both simulated and real data, and show its benefits in an analysis of data from a methicillin-resistant Staphylococcus aureus (MRSA) outbreak.
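To make the idea of ranking reference candidates by mapping coverage more concrete, the following is a minimal, illustrative sketch in Python. It is not the DaisyGPS or MicrobeGPS scoring system, whose details are not given here. It assumes, purely for illustration, that an acceptor candidate is covered broadly and evenly by the reads, while a donor candidate is covered only in a small local region. All names (`coverage_stats`, `rank_candidates`, `ref_A`, and so on) and both score formulas are hypothetical.

```python
from statistics import mean, pstdev


def coverage_stats(depths):
    """Return (breadth, evenness) for a list of per-base read depths.

    breadth  = fraction of positions with at least one mapped read
    evenness = 1 / (1 + coefficient of variation of the depth), in (0, 1]
    Both measures are illustrative choices, not taken from DaisyGPS.
    """
    if not depths or max(depths) == 0:
        return 0.0, 0.0
    breadth = sum(1 for d in depths if d > 0) / len(depths)
    cv = pstdev(depths) / mean(depths)
    return breadth, 1.0 / (1.0 + cv)


def rank_candidates(profiles):
    """Rank references as acceptor and donor candidates (toy scoring).

    Assumption of this sketch: acceptors show broad, even coverage;
    donors show coverage restricted to a small part of the genome.
    """
    acceptor_scores, donor_scores = {}, {}
    for name, depths in profiles.items():
        breadth, evenness = coverage_stats(depths)
        acceptor_scores[name] = breadth * evenness
        # A reference with no mapped reads is not a donor candidate.
        donor_scores[name] = (1.0 - breadth) if breadth > 0 else 0.0

    def ranked(scores):
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    return ranked(acceptor_scores), ranked(donor_scores)


# Hypothetical per-base depth profiles for three reference genomes.
profiles = {
    "ref_A": [30] * 95 + [0] * 5,   # almost fully and evenly covered
    "ref_B": [0] * 90 + [30] * 10,  # only a short region is covered
    "ref_C": [0] * 100,             # no reads map at all
}
acceptors, donors = rank_candidates(profiles)
print("acceptor ranking:", [name for name, _ in acceptors])
print("donor ranking:   ", [name for name, _ in donors])
```

In this toy setting `ref_A` ranks first as acceptor candidate and `ref_B` first as donor candidate. A real scoring scheme would also have to handle spurious mappings, repeats, and shared sequence between references, which this sketch ignores.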
