A bioinformatic filter for improved base-call accuracy and polymorphism detection using the Affymetrix GeneChip® whole-genome resequencing platform

DNA resequencing arrays enable rapid acquisition of high-quality sequence data and represent a promising platform for high-resolution genotyping of microorganisms. Traditional array-based resequencing methods have relied on specific PCR-amplified fragments from the query samples as hybridization targets. While this specificity in the target DNA population reduces the potential for artifacts caused by cross-hybridization, subsampling the query genome limits the sequence coverage that can be obtained and therefore reduces the technique's resolution as a genotyping method. We have developed and validated an Affymetrix Inc. GeneChip® array-based, whole-genome resequencing platform for Francisella tularensis, the causative agent of tularemia. We also developed a set of bioinformatic filters targeting systematic base-calling errors from two sources: cross-hybridization between the whole-genome sample and the array probes, and deletions in the sample DNA relative to the chip reference sequence. Our approach eliminated 91% of the false-positive single-nucleotide polymorphism calls identified in the SCHU S4 query sample, at the cost of 10.7% of the true positives, yielding a total base-calling accuracy of 99.992%.
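The trade-off reported above (false positives removed versus true positives lost) can be illustrated with a minimal Python sketch. The function names and all counts below are hypothetical: the counts are chosen only so that the ratios match the reported percentages, and they are not the study's actual call counts.

```python
def filter_tradeoff(fp_before, fp_after, tp_before, tp_after):
    """Return (fraction of false-positive SNP calls removed,
    fraction of true-positive SNP calls lost) for a filtering step."""
    if fp_before <= 0 or tp_before <= 0:
        raise ValueError("pre-filter counts must be positive")
    fp_removed = (fp_before - fp_after) / fp_before
    tp_lost = (tp_before - tp_after) / tp_before
    return fp_removed, tp_lost


def base_call_accuracy(correct_calls, total_calls):
    """Fraction of base calls that agree with the known sequence."""
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    return correct_calls / total_calls


# Hypothetical counts, NOT taken from the paper.
fp_removed, tp_lost = filter_tradeoff(
    fp_before=200, fp_after=18, tp_before=1000, tp_after=893
)
print(f"false positives removed: {fp_removed:.1%}")
print(f"true positives lost: {tp_lost:.1%}")
```

Reporting both fractions together makes the cost of a filter explicit: a filter that removes most false positives is only useful if the accompanying loss of true polymorphisms is acceptable for the genotyping application.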
