WormBase: Annotating many nematode genomes

© 2012 Landes Bioscience.

WormBase (www.wormbase.org) has served the scientific community for over 11 years as the central repository for genomic and genetic information on the soil nematode Caenorhabditis elegans. The resource began as a database housing the genomic sequence and the genetic and physical maps of a single species. It now represents the breadth and diversity of nematode research, currently serving genome sequence and annotation for around 20 nematodes. In this article, we focus on WormBase's role in genome sequence annotation, describing how we annotate and integrate data from a growing collection of nematode species and strains. We also review our approaches to sequence curation and discuss how large functional genomics projects such as modENCODE affect annotation quality.