PDA: Pooled DNA analyzer

Background: Association mapping using abundant single nucleotide polymorphisms (SNPs) is a powerful tool for identifying disease susceptibility genes for complex traits and for exploring genetic diversity. Genotyping large numbers of SNPs individually is performed routinely but is cost-prohibitive for large-scale genetic studies. DNA pooling is a reliable and cost-saving alternative genotyping method. However, no software has been developed for complete pooled-DNA analyses, including data standardization, allele frequency estimation, and single-point and multipoint DNA pooling association tests. This motivated the development of the software 'PDA' (Pooled DNA Analyzer) to analyze pooled DNA data.

Results: We developed the software PDA for the analysis of pooled-DNA data. PDA was originally implemented in MATLAB®, but it can also be run on a Windows system without a MATLAB® installation. PDA provides estimates of the coefficient of preferential amplification and of allele frequency. PDA includes an extended single-point association test, which can compare allele frequencies between two DNA pools constructed under different experimental conditions. PDA also provides novel chromosome-wide multipoint association tests based on p-value combinations and a sliding-window concept. This multipoint testing procedure overcomes a computational bottleneck of conventional haplotype-oriented multipoint methods in DNA pooling analyses and can handle data sets with a large pool size, a large number of polymorphic markers, or both. All of the PDA functions are illustrated with four real-data examples.

Conclusion: PDA is simple to operate and does not require users to have a strong statistical background. The software is available at http://www.ibms.sinica.edu.tw/%7Ecsjfann/first%20flow/pda.htm.
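The two computational ideas named in the abstract, correcting a pooled allele frequency for preferential amplification and combining single-marker p-values in a sliding window, can be illustrated with a minimal Python sketch. This is not PDA's code or its exact statistics: the function names are hypothetical, the correction uses the commonly cited heterozygote-ratio form p = h_A / (h_A + k·h_B), and Fisher's product method stands in for whichever p-value combination PDA actually applies.

```python
import math


def estimate_k(het_signal_a, het_signal_b):
    """Coefficient of preferential amplification (assumed estimator):
    the mean A/B signal ratio measured in heterozygous individuals."""
    ratios = [a / b for a, b in zip(het_signal_a, het_signal_b)]
    return sum(ratios) / len(ratios)


def corrected_allele_frequency(h_a, h_b, k):
    """Frequency of allele A in a pool from peak signals h_a and h_b,
    down-weighting unequal amplification via k."""
    return h_a / (h_a + k * h_b)


def fisher_combined_p(p_values):
    """Fisher's product method for p-values in (0, 1].

    The statistic -2 * sum(log p) is chi-square with 2n degrees of
    freedom under independence; for even degrees of freedom the upper
    tail has the closed form exp(-t) * sum_{i<n} t**i / i!, t = -sum(log p).
    """
    t = -sum(math.log(p) for p in p_values)
    term, total = 1.0, 1.0
    for i in range(1, len(p_values)):
        term *= t / i  # term is now t**i / i!
        total += term
    return math.exp(-t) * total


def sliding_window_p(p_values, window):
    """Combined p-value for every run of `window` consecutive markers."""
    return [
        fisher_combined_p(p_values[i:i + window])
        for i in range(len(p_values) - window + 1)
    ]


if __name__ == "__main__":
    k = estimate_k([1.2, 1.1, 1.3], [1.0, 1.0, 1.0])
    print(corrected_allele_frequency(60.0, 40.0, k))
    print(sliding_window_p([0.20, 0.03, 0.01, 0.40, 0.70], window=3))
```

Neighbouring SNPs are usually in linkage disequilibrium, so the independence assumption behind the chi-square reference distribution rarely holds. In practice the significance of a window statistic would be calibrated by permutation or simulation rather than read directly from this closed form.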
