Novel computational methods for increasing PCR primer design effectiveness in directed sequencing

Background: Polymerase chain reaction (PCR) is used in directed sequencing for the discovery of novel polymorphisms. As the first step in PCR-directed sequencing, effective PCR primer design is crucial for obtaining high-quality sequence data for target regions. Because current computational primer design tools are not fully tuned to stable underlying laboratory protocols, researchers may still be forced to iteratively optimize protocols for failed amplifications after the primers have been ordered. Furthermore, potentially identifiable factors that contribute to PCR failures have yet to be elucidated. The inefficiency of this approach to primer design is compounded in a high-throughput laboratory, where hundreds of genes may be targeted in one experiment.

Results: We have developed a fully integrated computational PCR primer design pipeline that plays a key role in our high-throughput directed sequencing pipeline. Investigators may specify target regions through a rich set of descriptors, such as Ensembl accessions and arbitrary genomic coordinates. Primer pairs are then selected computationally to produce a minimal amplicon set capable of tiling across the specified target regions. As part of the tiling process, primer pairs are computationally screened to meet the criteria for success with one of two PCR amplification protocols. In the process of improving our sequencing success rate, which currently exceeds 95% for exons, we have discovered novel and accurate computational methods capable of identifying primers that may lead to PCR failures. We describe the laboratory protocols and their associated, empirically determined computational parameters, as well as the novel computational methods, which may benefit others in future primer design research.

Conclusion: The high-throughput PCR primer design pipeline has been very successful in providing the basis for high-quality directed sequencing results and in minimizing costs associated with labor and reprocessing. The modular architecture of the primer design software has made it possible to readily integrate additional primer critique tests based on iterative feedback from the laboratory. As a result, the primer design software, coupled with the laboratory protocols, serves as a powerful tool for low- and high-throughput primer design to enable successful directed sequencing.
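
The tiling step mentioned in the Results (choosing a minimal amplicon set that covers a target region) can be illustrated with a classic greedy interval cover. This is only a sketch under stated assumptions, not the pipeline's actual algorithm: the function name `tile_target`, the half-open coordinate convention, and the idea that each candidate amplicon is already reduced to its readable interval are all hypothetical choices made for illustration. A production pipeline would probably also require a minimum overlap between neighbouring amplicons and would screen each primer pair before it becomes a candidate.

```python
from typing import List, Tuple

# Half-open interval [start, end) in genomic coordinates (an assumed convention).
Interval = Tuple[int, int]


def tile_target(target: Interval, candidates: List[Interval]) -> List[Interval]:
    """Pick a smallest subset of candidate amplicons whose union covers target.

    Greedy rule: among the candidates that start at or before the first
    uncovered position, take the one that reaches furthest to the right.
    Raises ValueError if some position cannot be covered.
    """
    start, end = target
    ordered = sorted(candidates)      # sort by start coordinate
    chosen: List[Interval] = []
    covered_to = start                # everything in [start, covered_to) is covered
    i = 0
    while covered_to < end:
        best = None
        # Scan every candidate that begins no later than the first uncovered base.
        while i < len(ordered) and ordered[i][0] <= covered_to:
            if best is None or ordered[i][1] > best[1]:
                best = ordered[i]
            i += 1
        if best is None or best[1] <= covered_to:
            raise ValueError(f"no candidate amplicon covers position {covered_to}")
        chosen.append(best)
        covered_to = best[1]
    return chosen


if __name__ == "__main__":
    # Hypothetical candidate amplicons around a 900 bp target region.
    amplicons = [(50, 400), (90, 300), (350, 700), (380, 650), (600, 1050), (690, 900)]
    print(tile_target((100, 1000), amplicons))
```

For this made-up input the function selects three amplicons, `(50, 400)`, `(350, 700)`, and `(600, 1050)`, and it raises `ValueError` when the candidates leave a gap, which is the point at which a real pipeline would need to relax its primer criteria or report the region as uncovered.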
