Comprehensive DNA Signature Discovery and Validation

DNA signatures are nucleotide sequences that can be used to detect the presence of an organism and to distinguish that organism from all other species. Here we describe Insignia, a new, comprehensive system for the rapid identification of signatures in the genomes of bacteria and viruses. With the availability of hundreds of complete bacterial and viral genome sequences, it is now possible to use computational methods to identify signature sequences in all of these species, and to use these signatures as the basis for diagnostic assays to detect and genotype microbes in both environmental and clinical samples. The success of such assays depends critically on the methods used to identify signatures that properly differentiate between the target genomes and the sample background. We have used Insignia to compute accurate signatures for most bacterial genomes and have made them available through our Web site. A sample of these signatures has been successfully tested on a set of 46 Vibrio cholerae strains, and the results indicate that the signatures are highly sensitive for detection as well as specific for discrimination between these strains and their near relatives. Our approach, in which the entire genomic complement of each organism is compared to identify probe targets, is a promising method for diagnostic assay development, and it gives assay designers the flexibility to choose probes from the most relevant genes or genomic regions. The Insignia system is freely accessible via a Web interface and has been released as open source software at: http://insignia.cbcb.umd.edu.
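
The idea of a signature, a sequence shared by the target genomes and absent from the background, can be illustrated with a toy example. The sketch below is not the Insignia algorithm, whose internals are not described in this abstract; it is a naive k-mer set comparison on short made-up strings, and the function names (`kmers`, `signature_kmers`) and the choice of fixed-length k-mers on both strands are assumptions made purely for illustration.

```python
# Toy illustration only: NOT the Insignia pipeline. All names here are
# hypothetical, and the sequences are invented for the example.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def kmers(seq, k):
    """Collect every k-mer of seq, on both strands, skipping ambiguous bases."""
    seq = seq.upper()
    found = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            found.add(kmer)
            found.add(reverse_complement(kmer))
    return found


def signature_kmers(targets, background, k):
    """K-mers present in every target sequence and in no background sequence."""
    if not targets:
        raise ValueError("at least one target sequence is required")
    shared = set.intersection(*(kmers(t, k) for t in targets))
    for genome in background:
        shared -= kmers(genome, k)
    return shared


if __name__ == "__main__":
    targets = ["ACGTTGCA", "TTACGTTG"]   # made-up "target strain" sequences
    background = ["GGGTTGCC"]            # made-up "near relative" sequence
    print(sorted(signature_kmers(targets, background, k=4)))
```

Requiring a k-mer to occur in every target corresponds to sensitivity for detection, and removing anything seen in the background corresponds to specificity. Holding full k-mer sets in memory would not scale to hundreds of complete genomes, which is why a real system needs more efficient sequence-comparison methods.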
