Current Strategies of Polyploid Plant Genome Sequence Assembly

Polyploidy, the duplication of an entire genome, occurs in the majority of angiosperms. Understanding polyploid genomes is important for improving the crops that humans rely on for sustenance and basic nutrition. As climate change continues to pose a potential threat to agricultural production, demand will grow for plant cultivars that resist biotic and abiotic stresses while providing improved nutrition. In the past decade, next-generation sequencing (NGS) has fundamentally changed the genomics landscape by providing tools for the exploration of polyploid genomes. Here, we review the challenges of assembling polyploid plant genomes and present recent advances in genomic resources and functional tools in molecular genetics and breeding. As genomes of diploid and less heterozygous progenitor species become increasingly available, we discuss the lack of complexity of these currently available reference genomes as it relates to polyploid crops. Finally, we review recent approaches to haplotyping by phasing and the impact of third-generation technologies on polyploid plant genome assembly.
