The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools

The Arabidopsis Information Resource (TAIR, http://arabidopsis.org) is a genome database for Arabidopsis thaliana, an important reference organism for many fundamental aspects of biology as well as basic and applied plant biology research. TAIR serves as a central access point for Arabidopsis data, annotates gene function and expression patterns using controlled vocabulary terms, and maintains and updates the A. thaliana genome assembly and annotation. TAIR also provides researchers with an extensive set of visualization and analysis tools. Recent developments include several new genome releases (TAIR8, TAIR9 and TAIR10) in which the A. thaliana assembly was updated, pseudogenes and transposon genes were re-annotated, and new data from proteomics and next-generation transcriptome sequencing were incorporated into gene models and splice variants. Other highlights include progress on functional annotation of the genome and the release of several new tools, including Textpresso for Arabidopsis, which supports full-text searches across a large body of research literature.