Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome through long-read (>11 kb), single molecule, real-time sequencing

The application of next-generation sequencing to estimate the genetic diversity of Plasmodium falciparum, the most lethal malaria parasite, has proved challenging because of the skewed AT-richness [∼80.6% (A + T)] of its genome and the lack of technology to assemble highly polymorphic subtelomeric regions that contain clonally variant, multigene virulence families (e.g., var and rifin). To address this, we performed amplification-free, single molecule, real-time sequencing of P. falciparum genomic DNA and generated reads of average length 12 kb, with 50% of the reads between 15.5 and 50 kb in length. Next, using the Hierarchical Genome Assembly Process, we assembled the P. falciparum genome de novo and successfully compiled all 14 nuclear chromosomes telomere-to-telomere. We also accurately resolved centromeres [∼90–99% (A + T)] and subtelomeric regions, and identified large insertions and duplications that add extra var and rifin genes to the genome, along with smaller structural variants such as homopolymer tract expansions. Overall, we show that amplification-free, long-read sequencing combined with de novo assembly overcomes major challenges inherent to studying the P. falciparum genome. Indeed, this technology may not only identify the polymorphic and repetitive subtelomeric sequences of parasite populations from endemic areas but may also evaluate structural variation linked to virulence, drug resistance and disease transmission.
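The (A + T) percentages quoted above are simple base-composition ratios. As an illustration only, and not the authors' pipeline, the following minimal Python sketch computes the A + T fraction of a sequence and flags windows at or above a chosen threshold, in the spirit of the ∼90–99% (A + T) centromeric regions. The function names, window size, step, and threshold are assumptions introduced for this example.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A/T bases among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    # Sequences with no unambiguous bases (e.g. all N) are reported as 0.0.
    return at / total if total else 0.0


def at_rich_windows(seq: str, window: int = 1000, step: int = 500,
                    threshold: float = 0.90):
    """Yield (start, end, fraction) for windows whose A+T fraction >= threshold.

    The window, step, and threshold defaults are illustrative assumptions,
    not values taken from the paper.
    """
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        frac = at_fraction(chunk)
        if frac >= threshold:
            yield start, start + len(chunk), frac


if __name__ == "__main__":
    toy = "ATATATATAT" + "GCGCGCGCGC"  # toy sequence, not real genomic data
    print(f"overall A+T fraction: {at_fraction(toy):.2f}")
    for start, end, frac in at_rich_windows(toy, window=10, step=10):
        print(f"AT-rich window {start}-{end}: {frac:.2f}")
```

On real data, the sequence would typically be read from a FASTA file of assembled chromosomes. A fixed sliding window is only a coarse screen and does not replace annotation-based identification of centromeres.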
