Alignment-Free Design of Highly Discriminatory Diagnostic Primer Sets for Escherichia coli O104:H4 Outbreak Strains

Background: An Escherichia coli O104:H4 outbreak in Germany in summer 2011 caused 53 deaths, over 4000 individual infections across Europe, and considerable economic, social and political impact. This outbreak was the first in a position to exploit rapid, benchtop high-throughput sequencing (HTS) technologies and crowdsourced data analysis early in its investigation, establishing a new paradigm for rapid response to disease threats. We describe a novel strategy for design of diagnostic PCR primers that exploited this rapid draft bacterial genome sequencing to distinguish between E. coli O104:H4 outbreak isolates and other pathogenic E. coli isolates, including the historical hæmolytic uræmic syndrome (HUSEC) E. coli HUSEC041 O104:H4 strain, which possesses the same serotype as the outbreak isolates.

Methodology/Principal Findings: Primers were designed using a novel alignment-free strategy against eleven draft whole genome assemblies of E. coli O104:H4 German outbreak isolates from the E. coli O104:H4 Genome Analysis Crowd-Sourcing Consortium website, and a negative sequence set containing 69 E. coli chromosome and plasmid sequences from public databases. Validation in vitro against 21 ‘positive’ E. coli O104:H4 outbreak and 32 ‘negative’ non-outbreak EHEC isolates indicated that individual primer sets exhibited 100% sensitivity for outbreak isolates, with false positive rates of between 9% and 22%. A minimal combination of two primers discriminated between outbreak and non-outbreak E. coli isolates with 100% sensitivity and 100% specificity.

Conclusions/Significance: Draft genomes of isolates of disease outbreak bacteria enable high throughput primer design and enhanced diagnostic performance in comparison to traditional molecular assays. Future outbreak investigations will be able to harness HTS rapidly to generate draft genome sequences and diagnostic primer sets, greatly facilitating epidemiology and clinical diagnostics. We expect that high throughput primer design strategies will enable faster, more precise responses to future disease outbreaks of bacterial origin, and help to mitigate their societal impact.
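The validation figures in the abstract (per-primer-set sensitivity and false positive rate, and a minimal combination reaching 100% sensitivity and specificity) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the primer set labels, and the toy amplification results are hypothetical and are not data from the study. The sketch also assumes that a combination calls an isolate positive only when every primer set in it amplifies; the abstract does not state how the two primers were combined.

```python
from itertools import combinations

def rates(calls, is_outbreak):
    """Return (sensitivity, false_positive_rate) for per-isolate boolean calls."""
    tp = sum(c and y for c, y in zip(calls, is_outbreak))
    fp = sum(c and not y for c, y in zip(calls, is_outbreak))
    pos = sum(is_outbreak)
    neg = len(is_outbreak) - pos
    return tp / pos, fp / neg

def combined_calls(results, names):
    """Assumed rule: positive only if every named primer set amplifies."""
    return [all(flags) for flags in zip(*(results[n] for n in names))]

def minimal_perfect_combination(results, is_outbreak):
    """Smallest group of primer sets with sensitivity 1.0 and false positive rate 0.0."""
    for size in range(1, len(results) + 1):
        for names in combinations(sorted(results), size):
            if rates(combined_calls(results, names), is_outbreak) == (1.0, 0.0):
                return names
    return None

# Hypothetical toy data: 3 outbreak isolates followed by 4 non-outbreak isolates.
is_outbreak = [True, True, True, False, False, False, False]
results = {
    "set_A": [True, True, True, True, False, False, False],
    "set_B": [True, True, True, False, True, False, False],
    "set_C": [True, True, True, True, True, False, False],
}

for name in sorted(results):
    print(name, rates(results[name], is_outbreak))
print(minimal_perfect_combination(results, is_outbreak))
```

In this toy data each primer set detects every outbreak isolate but also amplifies at least one non-outbreak isolate. Because the false positives of set_A and set_B fall on different isolates, requiring both to amplify removes them, which is the kind of effect a minimal discriminating combination relies on.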
