The Listeria monocytogenes Core-Genome Sequence Typer (LmCGST): a bioinformatic pipeline for molecular characterization with next-generation sequence data

Background: Next-generation sequencing provides a powerful means of molecular characterization. However, methods such as single-nucleotide polymorphism detection or whole-chromosome sequence analysis are computationally expensive, prone to error, and still less accessible than traditional typing methods. Here, we present the Listeria monocytogenes core-genome sequence typing (CGST) method for molecular characterization. This method uses a high-confidence core (HCC) genome, calculated to ensure accurate identification of orthologs. We also developed an evolutionarily relevant nomenclature based upon phylogenetic analysis of HCC genomes. Finally, we created a pipeline (LmCGST; https://sourceforge.net/projects/lmcgst/files/) that takes in raw next-generation sequencing reads, calculates a subject HCC profile, compares it to an expandable database, assigns a sequence type, and performs a phylogenetic analysis.

Results: We analyzed 29 high-quality, closed Listeria monocytogenes chromosome sequences and identified loci that are reliable targets for automated molecular characterization methods. We identified 1013 open-reading frames that comprise our HCC genome. We then populated a database with HCC profiles from 114 taxa. We sequenced 84 randomly selected isolates from the Listeriosis Reference Service for Canada's collection and analyzed them with the LmCGST pipeline. In addition, we generated pulsed-field gel electrophoresis, ribotyping, and in silico multi-locus sequence typing (MLST) data for the 84 isolates and compared the results to those obtained using the CGST method. We found that all of the methods yielded generally congruent results. However, because it resolves the isolates into a larger number of categories, the CGST method provides much greater discriminatory power than the other methods tested here.

Conclusions: We show that the CGST method provides increased discriminatory power relative to typing methods such as pulsed-field gel electrophoresis, ribotyping, and multi-locus sequence typing, while addressing several shortcomings of other methods of molecular characterization with next-generation sequence data. It uses discrete, well-defined groupings (types) of organisms that are phylogenetically relevant and easily interpreted. In addition, the CGST scheme can be expanded to include additional loci and HCC profiles in the future. Overall, the CGST method provides an approach to the molecular characterization of Listeria monocytogenes with next-generation sequence data that is highly reproducible, easily standardized, portable, and accessible.
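
The link between the number of categories and discriminatory power can be made concrete with Simpson's index of diversity, a measure commonly used to compare typing schemes. The sketch below is illustrative only: the abstract does not state which index was used, so the choice of index is an assumption, and the function name and the six-isolate toy data are hypothetical rather than taken from the study.

```python
from collections import Counter


def simpsons_diversity(type_labels):
    """Simpson's index of diversity for a list of type assignments.

    Returns the probability that two isolates drawn at random, without
    replacement, are assigned to different types.
    """
    n = len(type_labels)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_labels).values()
    # Number of ordered pairs of isolates that share a type.
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical toy data: the same six isolates typed by a coarse scheme
# (two categories) and by a finer scheme (five categories).
coarse = ["A", "A", "A", "B", "B", "B"]
fine = ["A1", "A2", "A3", "B1", "B1", "B2"]

coarse_d = simpsons_diversity(coarse)
fine_d = simpsons_diversity(fine)
print(f"coarse scheme: {coarse_d:.3f}")
print(f"fine scheme:   {fine_d:.3f}")
```

In this toy case the finer scheme scores higher because fewer pairs of isolates share a type, which is the sense in which a method that resolves more categories is more discriminatory.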
