From RNA-seq to large-scale genotyping - genomics resources for rye (Secale cereale L.)

Background: The improvement of agricultural crops with regard to yield, resistance and environmental adaptation is a perpetual challenge for both breeding and research. Exploration of the genetic potential and implementation of genome-based breeding strategies for efficient rye (Secale cereale L.) cultivar improvement have been hampered by the lack of genome sequence information. To overcome this limitation, we sequenced the transcriptomes of five winter rye inbred lines using Roche/454 GS FLX technology.

Results: More than 2.5 million reads were assembled into 115,400 contigs representing a comprehensive rye expressed sequence tag (EST) resource. From sequence comparisons, 5,234 single nucleotide polymorphisms (SNPs) were identified and used to develop the Rye5K high-throughput SNP genotyping array. Performance of the Rye5K SNP array was investigated by genotyping 59 rye inbred lines, including the five lines used for sequencing, as well as five barley, three wheat, and two triticale accessions. A balanced distribution of allele frequencies ranging from 0.1 to 0.9 was observed. Residual heterozygosity of the rye inbred lines varied from 4.0 to 20.4%, with higher average heterozygosity in the pollen parent pool than in the seed parent pool.

Conclusions: The established sequence and molecular marker resources will improve and promote genetic and genomic research as well as genome-based breeding in rye.
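The two summary statistics reported above, allele frequency per SNP and residual heterozygosity per inbred line, can be illustrated with a minimal sketch. It assumes biallelic SNP calls encoded as "AA", "AB", "BB", or None for a failed call; this encoding and the function names are hypothetical and are not taken from the study's own analysis pipeline.

```python
from typing import List, Optional

# Hypothetical encoding of one biallelic SNP call:
# "AA" / "BB" = homozygous, "AB" = heterozygous, None = failed call.
Genotype = Optional[str]


def residual_heterozygosity(calls: List[Genotype]) -> float:
    """Fraction of successfully called SNPs in one line that are heterozygous."""
    called = [g for g in calls if g is not None]
    if not called:
        raise ValueError("no successful genotype calls")
    return sum(g == "AB" for g in called) / len(called)


def allele_b_frequency(calls: List[Genotype]) -> float:
    """Frequency of allele B at one SNP across all lines with a call."""
    called = [g for g in calls if g is not None]
    if not called:
        raise ValueError("no successful genotype calls")
    # Each diploid call carries two alleles; count the B copies.
    return sum(g.count("B") for g in called) / (2 * len(called))


if __name__ == "__main__":
    # Calls for one line across five SNPs (one failed).
    line_calls = ["AA", "AB", "BB", None, "AA"]
    print(residual_heterozygosity(line_calls))

    # Calls for one SNP across four lines (one failed).
    snp_calls = ["AA", "AB", "BB", None]
    print(allele_b_frequency(snp_calls))
```

Failed calls are excluded from both denominators, so a line or SNP with many missing calls is not biased toward lower values; whether the study handled missing data this way is not stated in the abstract.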
