Inferring relative proportions of DNA variants from sequencing electropherograms

MOTIVATION: Determination of the relative copy number of single-nucleotide sequence variants (SNVs) within a DNA sample is a frequent experimental goal. Various methods can be applied to this problem, although hybridization-based approaches tend to suffer from high setup costs and poor adaptability, while others (such as pyrosequencing) may not be accessible to all laboratories. The potential to extract relative copy number information from standard dye-terminator electropherograms has been little explored, yet this technology is cheap and widely accessible. Since several biologically important loci have paralogous copies that interfere with genotyping, and may also display copy number variation (CNV), there are many situations in which determination of the relative copy number of SNVs is desirable.

RESULTS: We have developed a desktop application, QSVanalyzer, which allows high-throughput quantification of the proportions of DNA sequences containing SNVs. In reconstruction experiments, QSVanalyzer accurately estimated the known relative proportions of SNVs. By analyzing a large panel of genomic DNA samples, we demonstrate the ability of the software to analyze not only common biallelic SNVs, but also SNVs within a locus at which gene conversion between four genomic paralogs operates, and within another that is subject to CNV.

AVAILABILITY AND IMPLEMENTATION: QSVanalyzer is freely available at http://dna.leeds.ac.uk/qsv/. It requires the Microsoft .NET framework version 2.0, which can be installed on all Microsoft operating systems from Windows 98 onwards.

CONTACT: msjimc@leeds.ac.uk

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
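
The abstract does not describe how QSVanalyzer computes proportions, so the following is only an illustrative sketch of the general idea of reading a relative proportion from the two overlapping peaks at a variant position in a dye-terminator trace. The function name `estimate_proportion`, its parameters, and the optional calibration against a trace from a known 1:1 mixture are all assumptions made for this example, not the published method.

```python
def estimate_proportion(peak_a, peak_b, ref_a=None, ref_b=None):
    """Estimate the fraction of templates carrying variant A at one position.

    peak_a, peak_b: peak heights of the two alternative bases in the test trace.
    ref_a, ref_b:   optional peak heights of the same bases in a calibration
                    trace from a known 1:1 mixture (hypothetical correction for
                    unequal signal from the two dye-terminators).
    """
    if peak_a < 0 or peak_b < 0:
        raise ValueError("peak heights must be non-negative")

    # Optionally rescale each peak by its height in the 1:1 calibration trace,
    # so that an equal mixture yields a proportion of 0.5.
    if ref_a is not None and ref_b is not None:
        if ref_a <= 0 or ref_b <= 0:
            raise ValueError("calibration peak heights must be positive")
        peak_a = peak_a / ref_a
        peak_b = peak_b / ref_b

    total = peak_a + peak_b
    if total == 0:
        raise ValueError("no signal at this position")
    return peak_a / total


if __name__ == "__main__":
    # Uncorrected ratio of raw peak heights.
    print(estimate_proportion(300, 100))
    # Same trace, corrected with a 1:1 calibration trace showing the same bias.
    print(estimate_proportion(300, 100, ref_a=600, ref_b=200))
```

The calibration step illustrates why raw peak heights alone can mislead: in this made-up example the uncorrected estimate is 0.75, but the same 3:1 height ratio appears in the 1:1 calibration trace, so the corrected estimate is 0.5.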
