Comprehensive viral oligonucleotide probe design using conserved protein regions

Oligonucleotide microarrays have been applied to microbial surveillance and discovery, where highly multiplexed assays are required to address a wide range of genetic targets. Although printing density continues to increase, the design of comprehensive microbial probe sets remains a daunting challenge, particularly in virology, where rapid sequence evolution and database expansion confound static solutions. Here, we present a strategy for probe design based on protein sequences that addresses the unique problems posed by virus detection and discovery. The method uses the Protein Families database (Pfam) and motif-finding algorithms to identify oligonucleotide probes in conserved amino acid regions and untranslated sequences. In silico testing using an experimentally derived thermodynamic model indicated near-complete coverage of the viral sequence database.
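The core idea of choosing probes from conserved amino acid regions can be illustrated with a minimal sketch. It is not the published pipeline: a simple majority-residue score stands in for the Pfam alignments and motif-finding algorithms, the thermodynamic model is omitted, and all function names and the toy sequences are hypothetical. The sketch assumes that each coding sequence starts at the first ungapped residue of its aligned protein.

```python
from collections import Counter


def column_conservation(aligned_proteins):
    """Per-column fraction of sequences carrying the most common residue.

    Gaps ('-') never count toward the most common residue.
    """
    n = len(aligned_proteins)
    scores = []
    for column in zip(*aligned_proteins):
        counts = Counter(res for res in column if res != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / n)
    return scores


def most_conserved_window(aligned_proteins, window_aa):
    """Return (start_column, mean_score) of the best window of window_aa columns."""
    scores = column_conservation(aligned_proteins)
    best_start, best_score = 0, -1.0
    for start in range(len(scores) - window_aa + 1):
        mean = sum(scores[start:start + window_aa]) / window_aa
        if mean > best_score:
            best_start, best_score = start, mean
    return best_start, best_score


def candidate_probes(aligned_proteins, coding_seqs, window_aa):
    """Collect the distinct nucleotide stretches that encode the conserved window.

    Assumption: coding_seqs[i] encodes the ungapped form of aligned_proteins[i],
    in frame, starting at its first residue.
    """
    start, _ = most_conserved_window(aligned_proteins, window_aa)
    probes = set()
    for protein, cds in zip(aligned_proteins, coding_seqs):
        window = protein[start:start + window_aa]
        if "-" in window:
            continue  # skip sequences that are gapped in the window
        # Count the real residues before the window to find the codon offset.
        offset = len(protein[:start].replace("-", ""))
        probes.add(cds[3 * offset:3 * (offset + window_aa)])
    return sorted(probes)


# Toy alignment (hypothetical data): the W-L columns are fully conserved.
proteins = ["MKWLV", "MRWLV", "-KWLI"]
cds = ["ATGAAATGGCTGGTG", "ATGCGTTGGCTCGTG", "AAATGGCTGATT"]
print(candidate_probes(proteins, cds, window_aa=2))
```

In this toy case the conserved protein window maps to two distinct nucleotide variants because of a synonymous codon change, which illustrates why a protein-level conserved region may still need more than one probe for full nucleotide-level coverage.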
