Recent developments in DNA sequencing have enabled the large and complex genomes of many crop species to be sequenced for the first time, even those previously intractable due to their polyploid nature. Indeed, over the last 2 years, the genome sequences of several commercially important cereals, notably barley and bread wheat, have become available, as well as those of related wild species. Although these assemblies are still incomplete, comparison with other, more completely assembled species suggests that coverage of genic regions is likely to be high. Ensembl Plants (http://plants.ensembl.org) is an integrative resource for organizing, analyzing and visualizing genome-scale information for important crop and model plants. Available data include reference genome sequence, variant loci, gene models and functional annotation. For variant loci, individual and population genotypes, linkage information and, where available, phenotypic information are shown. Comparative analyses are performed on DNA and protein sequence alignments. The resulting genome alignments and gene trees, the latter representing the implied evolutionary history of each gene family, are made available for visualization and analysis. Driven by the case of bread wheat, specific extensions to the analysis pipelines and web interface have recently been developed to support polyploid genomes. Data in Ensembl Plants are accessible through a genome browser incorporating various specialist interfaces for different data types, and through a variety of additional methods for programmatic access and data mining. These interfaces are consistent with those offered through the Ensembl interface for the genomes of non-plant species, including those of plant pathogens, pests and pollinators, facilitating the study of the plant in its environment.
