Genome Reduction Uncovers a Large Dispensable Genome and Adaptive Role for Copy Number Variation in Asexually Propagated Solanum tuberosum

Asexually propagated potato shows greater copy number variation than sexually propagated plant species, with a strong connection to environmental response pathways.

Clonally reproducing plants have the potential to bear a significantly greater mutational load than sexually reproducing species. To investigate this possibility, we examined the breadth of genome-wide structural variation in a panel of monoploid/doubled monoploid clones generated from native populations of diploid potato (Solanum tuberosum), a highly heterozygous, asexually propagated plant. As rare instances of purely homozygous clones, they provided an ideal set for determining the degree of structural variation tolerated by this species and for deriving its minimal gene complement. Extensive copy number variation (CNV) was uncovered, impacting 219.8 Mb (30.2%) of the potato genome, with nearly 30% of genes subject to at least partial duplication or deletion, revealing the highly heterogeneous nature of the potato genome. Dispensable genes (>7000) were associated with limited transcription and/or a recent evolutionary history, and genes conserved across angiosperms showed a lower deletion frequency. An association of CNV with plant adaptation was highlighted by enrichment in gene clusters encoding functions for environmental stress response, with gene duplication playing a part in species-specific expansions of stress-related gene families. This study revealed unique impacts of CNV in a species with asexual reproductive habits and showed how CNV may drive adaptation through the evolution of key stress pathways.
