Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication

Cultivated citrus are selections from, or hybrids of, wild progenitor species whose identities and contributions to citrus domestication remain controversial. Here we sequence and compare citrus genomes—a high-quality reference haploid clementine genome and mandarin, pummelo, sweet-orange and sour-orange genomes—and show that cultivated types derive from two progenitor species. Although cultivated pummelos represent selections from one progenitor species, Citrus maxima, cultivated mandarins are introgressions of C. maxima into the ancestral mandarin species Citrus reticulata. The most widely cultivated citrus, sweet orange, is the offspring of previously admixed individuals, but sour orange is an F1 hybrid of pure C. maxima and C. reticulata parents, thus implying that wild mandarins were part of the early breeding germplasm. A Chinese wild 'mandarin' diverges substantially from C. reticulata, thus suggesting the possibility of other unrecognized wild citrus species. Understanding citrus phylogeny through genome analysis clarifies taxonomic relationships and facilitates sequence-directed genetic improvement.

Andrea Zuccolo | Mark Borodovsky | Simon Prochnik | Karine Labadie | Patrick Wincker | Kamel Jabbari | Jeremy Schmutz | Simone Scalabrin | Jarrod Chapman | Xavier Perrier | Dominique Brunel | Alexandre Lomsadze | Michele Morgante | François Luro | Julie Poulain | Manuel Ruiz | Florent Murat | Jane Grimwood | Karin M. Fredrikson | Federica Cattonaro | Daniel Ramón | Luis Navarro | Pablo Aleza | Uffe Hellsten | Marcos Antonio Machado | Cristian Del Fabbro | Javier Terol | Victoria Ibanez | Daniel Rokhsar | Chunxian Chen | Amparo Herrero-Ortega | Giuseppe Reforgiato | Brian Desany | Jerry Jenkins | G Albert Wu | Jerome Salse | Marco Aurélio Takita | Arnaud Couloux | Sara Pinosio | Francisco R Tadeo | Leandro H Estornell | Juan V Muñoz-Sanz | Julián Pérez-Pérez | William G Farmerie | Chinnappa Kodira | Mohammed Mohiuddin | Tim Harkins | Paul Burns | Juliana Freitas-Astúa | Francis Quétier | Mikeal Roose | Manuel Talón | Olivier Jaillon | Patrick Ollitrault | Frederick Gmitter
