RNA-Seq analysis and annotation of a draft blueberry genome assembly identifies candidate genes involved in fruit ripening, biosynthesis of bioactive compounds, and stage-specific alternative splicing

Background: Blueberries are a rich source of antioxidants and other beneficial compounds that can protect against disease. Identifying genes involved in synthesis of bioactive compounds could enable the breeding of berry varieties with enhanced health benefits.

Results: Toward this end, we annotated a previously sequenced draft blueberry genome assembly using RNA-Seq data from five stages of berry fruit development and ripening. Genome-guided assembly of RNA-Seq read alignments combined with output from ab initio gene finders produced around 60,000 gene models, of which more than half were similar to proteins from other species, typically the grape Vitis vinifera. Comparison of gene models to the PlantCyc database of metabolic pathway enzymes identified candidate genes involved in synthesis of bioactive compounds, including bixin, an apocarotenoid with potential disease-fighting properties, and defense-related cyanogenic glycosides, which are toxic. Cyanogenic glycoside (CG) biosynthetic enzymes were highly expressed in green fruit, and a candidate CG detoxification enzyme was up-regulated during fruit ripening. Candidate genes for ethylene, anthocyanin, and 400 other biosynthetic pathways were also identified. Homology-based annotation using Blast2GO and InterPro assigned Gene Ontology terms to around 15,000 genes. RNA-Seq expression profiling showed that blueberry growth, maturation, and ripening involve dynamic gene expression changes, including coordinated up- and down-regulation of metabolic pathway enzymes and transcriptional regulators. Analysis of RNA-Seq alignments identified developmentally regulated alternative splicing, promoter use, and 3′ end formation.

Conclusions: We report genome sequence, gene models, functional annotations, and RNA-Seq expression data that provide an important new resource enabling high throughput studies in blueberry.
