Draft genome sequence of Solanum aethiopicum provides insights into disease resistance, drought tolerance, and the evolution of the genome

Background S. aethiopicum is a close relative of S. melongena and has been routinely used to improve disease resistance in S. melongena. However, these efforts have been greatly limited by the lack of a reference genome and of a clear understanding of the genes involved in biotic and abiotic stress responses.

Results We present a draft genome assembly of S. aethiopicum, 1.02 Gb in size, which is predominantly occupied by repetitive sequences (76.2%), particularly long terminal repeat retrotransposons (LTR-Rs). We annotated 37,681 gene models, including 34,905 protein-coding genes. We observed an expansion of resistance genes through two rounds of LTR-R amplification, which occurred around 1.25 and 3.5 million years ago, respectively. Expansion also occurred in gene families related to drought tolerance. A total of 14,995,740 SNPs were identified by re-sequencing 65 S. aethiopicum genotypes, including “Gilo” and “Shum” accessions; 41,046 of these SNPs are closely linked to resistance genes. Domestication and demographic history analyses reveal selection of genes involved in drought tolerance in both the “Gilo” and “Shum” groups. A pan-genome of S. aethiopicum with a total of 36,250 protein-coding genes was assembled, of which 1,345 genes are missing from the reference genome.

Conclusions Overall, the genome sequence of S. aethiopicum increases our understanding of the genomic mechanisms underlying its extraordinary disease resistance and drought tolerance. The SNPs identified are available for potential use by breeders. The information provided here will greatly accelerate the selection and breeding of the African eggplant, as well as of other crops within the Solanaceae family.
