The emerging biofuel crop Camelina sativa retains a highly undifferentiated hexaploid genome structure

Camelina sativa is an oilseed with desirable agronomic and oil-quality attributes for a viable industrial oil platform crop. Here we generate the first chromosome-scale high-quality reference genome sequence for C. sativa and annotate 89,418 protein-coding genes, representing a whole-genome triplication event relative to the crucifer model Arabidopsis thaliana. C. sativa is the first crop species to be sequenced from lineage I of the Brassicaceae. The well-preserved hexaploid genome structure of C. sativa surprisingly mirrors those of economically important amphidiploid Brassica crop species from lineage II as well as wheat and cotton. The three genomes of C. sativa show no evidence of fractionation bias and limited expression-level bias, both characteristics commonly associated with polyploid evolution. The highly undifferentiated polyploid genome of C. sativa has significant consequences for breeding and genetic manipulation of this industrial oil crop.
