Bioinformatics in bacterial molecular epidemiology and public health: databases, tools and the next-generation sequencing revolution.

Advances in typing methodologies have been the driving force in the molecular epidemiology of pathogens. The development of molecular methods, and more recently of DNA sequencing methods that complement and improve on phenotypic identification, generated large amounts of data and created the need for ways of storing and analysing them. At the same time, advances in computing allowed the development of specialised algorithms for image analysis, for data sharing and integration, and for mining the ever larger amounts of accumulated data. In this review, we discuss how bioinformatics has accompanied these changes in bacterial molecular epidemiology, and the public health benefits of specialised online typing databases and of algorithms that allow real-time data analysis and visualisation. We also evaluate the impact of the new and disruptive next-generation sequencing methodologies and look ahead to the novel challenges they bring.
