The Gynandropsis gynandra genome provides insights into whole-genome duplications and the evolution of C4 photosynthesis in Cleomaceae

Abstract

Gynandropsis gynandra (Cleomaceae) is a cosmopolitan leafy vegetable and medicinal plant that has also been used as a model for studying C4 photosynthesis because of its evolutionary proximity to the C3 species Arabidopsis (Arabidopsis thaliana). Here, we present the genome sequence of G. gynandra, anchored onto 17 main pseudomolecules with a total length of 740 Mb, an N50 of 42 Mb and 30,933 well-supported gene models. Together with previously released genomes of C3 relatives in the Cleomaceae and Brassicaceae, the G. gynandra genome makes an excellent model for studying the role of genome evolution in the transition from C3 to C4 photosynthesis. Our analyses revealed that G. gynandra and its C3 relative Tarenaya hassleriana shared a whole-genome duplication event (Gg-α), after which a third genome was added (Th-α, +1×) in T. hassleriana but not in G. gynandra. Analysis of syntenic copy number of C4 photosynthesis-related gene families indicates that G. gynandra generally retained more duplicated copies of these genes than the C3 species T. hassleriana, and that the G. gynandra C4 genes might have been under positive selection pressure. Both whole-genome and single-gene duplication were found to contribute to the expansion of these gene families in G. gynandra. Collectively, this study enhances our understanding of polyploidy history, gene duplication and retention, and their impact on the evolution of C4 photosynthesis in Cleomaceae.
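
For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The function name `n50` and the toy lengths are illustrative assumptions, not values or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running sum reaches at least half of the total length; the length
    of the sequence that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths in Mb (not taken from the assembly).
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))
```

A higher N50 generally indicates a more contiguous assembly, but the value depends on the total length used, so figures are only comparable between assemblies of similar size.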
