Prediction System for Rapid Identification of Salmonella Serotypes Based on Pulsed-Field Gel Electrophoresis Fingerprints

ABSTRACT A classification model is presented for the rapid identification of Salmonella serotypes from pulsed-field gel electrophoresis (PFGE) fingerprints. The model was developed using random forest and support vector machine algorithms and was applied to a database of 45,923 PFGE patterns randomly selected from all submissions to CDC PulseNet from 2005 to 2010. The selected patterns covered the 20 most frequent serotypes and 12 less frequent serotypes from various sources. Prediction accuracies for the 32 serotypes ranged from 68.8% to 99.9%, with an overall accuracy of 96.0%, for the random forest classification, and from 67.8% to 100.0%, with an overall accuracy of 96.1%, for the support vector machine classification. The prediction system improves reliability and accuracy and provides a new tool for early, fast screening and source tracking of outbreak isolates. It is especially useful for obtaining serotype information before conventional serotyping is complete. The system also works well for isolates that conventional methods report as “unknown,” and it is useful in laboratories where standard serotyping is not available.
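The abstract reports both a per-serotype accuracy range and a single overall accuracy. The minimal sketch below shows one common way such figures can be computed from true and predicted labels; it is an illustration, not the authors' evaluation code. The function names and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Return {serotype: fraction of that serotype's isolates predicted correctly}."""
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {serotype: correct[serotype] / totals[serotype] for serotype in totals}


def overall_accuracy(true_labels, predicted_labels):
    """Return the fraction of all isolates predicted correctly."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Toy labels for illustration only (hypothetical, not from the study).
true_labels = ["Enteritidis"] * 8 + ["Typhimurium"] * 8 + ["Montevideo"] * 4
predicted_labels = (
    ["Enteritidis"] * 8                      # all 8 correct
    + ["Typhimurium"] * 7 + ["Enteritidis"]  # 7 of 8 correct
    + ["Montevideo"] * 3 + ["Typhimurium"]   # 3 of 4 correct
)

by_serotype = per_class_accuracy(true_labels, predicted_labels)
overall = overall_accuracy(true_labels, predicted_labels)

for serotype, accuracy in sorted(by_serotype.items()):
    print(f"{serotype}: {accuracy:.1%}")
print(f"overall: {overall:.1%}")
```

Computed this way, the overall figure is weighted by how many isolates each serotype contributes, so a high overall accuracy can coexist with a much lower accuracy for a rare serotype.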
