NemaFootPrinter: a web based software for the identification of conserved non-coding genome sequence regions between C. elegans and C. briggsae

Background: NemaFootPrinter (Nematode Transcription Factor Scan Through Phylogenetic Footprinting) is a web-based software tool for the interactive identification of conserved, non-exonic DNA segments in the genomes of C. elegans and C. briggsae. It was implemented according to the following project specifications:
a) Automated identification of orthologous gene pairs.
b) Interactive selection of the boundaries of the genes to be compared.
c) Pairwise sequence comparison with a range of different methods.
d) Identification of putative transcription factor binding sites on conserved, non-exonic DNA segments.

Results: Starting from a C. elegans or C. briggsae gene name or identifier, the software identifies the putative ortholog (if any), based on information derived from public nematode genome annotation databases. The investigator can then retrieve the genomic DNA sequences of the two orthologous genes; visualize the genes' intron/exon structure and the surrounding DNA regions graphically; and select, through an interactive graphical user interface, subsequences of the two gene regions. Using a bioinformatics toolbox (Blast2seq, Dotmatcher, Ssearch and a connection to the rVista database), the investigator can then identify and analyze significant sequence similarities and detect transcription factor binding sites within the conserved segments. The software automatically masks exons.

Discussion: This software is intended as a practical and intuitive tool for researchers interested in identifying non-exonic sequence segments conserved between C. elegans and C. briggsae. These sequences may contain transcriptional regulatory elements, since they are conserved between two related but rapidly evolving genomes.
This software also highlights the power of genome annotation databases when they are conceived as an open resource, and the possibilities offered by the seamless integration of different web services over HTTP.

Availability: The program is freely available at http://bio.ifom-firc.it/NTFootPrinter
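
The exon-masking step mentioned in the Results can be illustrated with a minimal sketch. This is not the NemaFootPrinter implementation: the function name `mask_exons`, the 0-based half-open coordinate convention, and the choice of `N` as the masking character are all assumptions made for illustration. The idea is simply that exonic bases are replaced before pairwise comparison, so that any similarity reported afterwards comes from non-exonic sequence.

```python
from typing import Iterable, Tuple


def mask_exons(sequence: str,
               exons: Iterable[Tuple[int, int]],
               mask_char: str = "N") -> str:
    """Return `sequence` with every exonic base replaced by `mask_char`.

    `exons` holds 0-based, half-open (start, end) intervals relative to
    `sequence`. This coordinate convention is an assumption of the sketch,
    not something taken from the tool itself.
    """
    masked = list(sequence)
    for start, end in exons:
        # Clamp to the sequence bounds so out-of-range annotations are harmless.
        lo = max(0, start)
        hi = min(len(masked), end)
        for i in range(lo, hi):
            masked[i] = mask_char
    return "".join(masked)


if __name__ == "__main__":
    # Hypothetical 20-base gene region with two annotated exons.
    region = "ACGTACGTACGTACGTACGT"
    print(mask_exons(region, [(4, 8), (12, 16)]))
```

Masking with `N` is a common convention because most alignment tools treat `N` as an ambiguous base; whether a given aligner scores it as a mismatch or ignores it depends on that tool's settings.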
