GeneViTo: Visualizing gene-product functional and structural features in genomic datasets

Background

The availability of increasing amounts of sequence data from completely sequenced genomes boosts the development of new computational methods for automated genome annotation and comparative genomics. There is therefore a need for tools that facilitate the visualization of raw data and of results produced by bioinformatics analysis, providing new means for interactive genome exploration. Visual inspection can be used as a basis to assess the quality of various analysis algorithms and to aid in-depth genomic studies.

Results

GeneViTo is a Java-based computer application that serves as a workbench for genome-wide analysis through visual interaction. The application handles experimental information concerning both DNA and protein sequences (derived from public sequence databases or proprietary data sources) as well as meta-data obtained from prediction algorithms, classification schemes or user-defined features. Interaction with a Graphical User Interface (GUI) allows easy extraction of genomic and proteomic data referring to the sequence itself, sequence features, or general structural and functional features. Emphasis is placed on comparing annotation and prediction data, either to supplement the provided information, especially in cases of "poor" annotation, or to evaluate the available predictions. Moreover, desired information can be exported as high-quality JPEG image files for further elaboration and scientific use. A compilation of properly formatted GeneViTo input data for demonstration is available to interested readers for two completely sequenced prokaryotes, Chlamydia trachomatis and Methanococcus jannaschii.

Conclusions

GeneViTo offers a visual overview of genomic functional elements, drawing on data from both database annotation and analysis tools, for an overall analysis of existing genomes. The application is compatible with Linux and Windows ME/2000/XP operating systems, provided that the appropriate Java Runtime Environment is already installed on the system.
