GWFASTA: server for FASTA search in eukaryotic and microbial genomes.

Similarity searches are a powerful method for addressing important biological problems such as database scanning, evolutionary studies, gene prediction, and protein structure prediction. FASTA is a widely used sequence comparison tool for rapid database scanning. Here we describe the GWFASTA server, which was developed to assist FASTA users in similarity searches against partially and/or completely sequenced genomes. GWFASTA provides access to more than 60 microbial genomes, eight eukaryotic genomes, and the proteomes of annotated genomes. In fact, it provides the largest number of databases for similarity searching from a single platform. GWFASTA allows more than one sequence to be submitted as a single query for a FASTA search. It also provides integrated post-processing of FASTA output, including compositional analysis of proteins, multiple sequence alignment, and phylogenetic analysis. Furthermore, it summarizes the search results by organism for prokaryotes and by chromosome for eukaryotes. Thus, the integration of different sequence analysis tools makes GWFASTA a powerful tool for biologists.
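To make the multi-sequence query and the compositional analysis concrete, the following is a minimal Python sketch, not GWFASTA's actual implementation. It assumes the query is supplied as plain multi-FASTA text; the function names `parse_fasta` and `composition`, and the two example sequences, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def parse_fasta(text):
    """Split multi-FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def composition(sequence):
    """Return the percentage of each residue in a sequence."""
    counts = Counter(sequence)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {res: 100.0 * n / total for res, n in sorted(counts.items())}


# Hypothetical two-sequence query, submitted as a single block of text.
query = """>seq1 hypothetical example
MKKA
>seq2 hypothetical example
GGAL
"""

records = parse_fasta(query)
for header, seq in records:
    print(header, composition(seq))
```

Each record parsed this way could be searched independently and its hits then grouped by organism or chromosome, which is the kind of summary the server is described as producing.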
