Identification of pathogen genomic variants through an integrated pipeline

Background
Whole-genome sequencing is a powerful experimental tool for pathogen research. We present methods for the analysis of small eukaryotic genomes, including a streamlined system (called Platypus) for finding single nucleotide variants (SNVs) and copy number variants (CNVs) as well as recombination events.

Results
We validated our pipeline using four Plasmodium falciparum drug-resistance data sets containing 26 clones from the 3D7 and Dd2 background strains, identifying an average of 11 single nucleotide variants per clone. We also identify 8 copy number variants that contribute to resistance, and report for the first time that all analyzed amplification events are in tandem.

Conclusions
The Platypus pipeline provides malaria researchers with a powerful tool to analyze short-read sequencing data. It provides an accurate way to detect SNVs using known software packages and a novel methodology for detecting CNVs, though it does not currently support detection of small indels. We have validated that the pipeline detects known SNVs in a variety of samples while filtering out spurious data. We bundle the methods into a freely available package.
