Plant genome annotation methods.

Annotation of plant genomic sequences can be divided into structural and functional annotation. Structural annotation is the foundation of all genomics: without accurate gene models, efforts to understand gene function or the evolution of genes across taxa are impeded. Structural annotation depends on sensitive and specific computational programs, together with deep experimental evidence, to identify gene features within genomic DNA. Functional annotation depends heavily on sequence similarity to known genes or proteins, because most initial "first-pass" functional annotation on a genomic scale is transitive, that is, inferred from a similar, previously annotated sequence. Coupling structural and functional annotation across genomes in a comparative manner promotes more accurate annotation as well as an understanding of gene and genome evolution. As more plant genome sequence data become available, the value of comparative annotation will increase. As in any new field, the methodologies for genome annotation are still evolving and will improve in the future.
