MulBlast 1.0: a multiple alignment of BLAST output to boost protein sequence similarity analysis

Protein sequence similarity searching has become a major tool for biologists. Various fast and efficient programs (BLAST, FASTA, Automat, etc.) and comparison matrices have been designed and refined to perform the database scan. However, the final step of the search, the analysis of the results, remains tedious and time-consuming. To improve the screening of true-positive hits, we have developed a program that builds a multiple alignment from BLAST search output and highlights conserved sequence segments. This makes both known and new sequence patterns easier to recognize, and allows significant similarities, protein family signatures and new sequence motifs to be identified at a glance. The alignment is written in a format compatible with the GCG programs LineUp and ProfileMake.
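The idea of turning pairwise BLAST hits into a single multiple alignment can be illustrated with a minimal sketch. It is not the MulBlast implementation: it assumes a simple query-anchored stacking in which every hit is placed on the query's coordinates and residues inserted relative to the query are dropped. The function names, the tuple layout of a hit, and the `min_depth` and `min_fraction` thresholds are all hypothetical choices made for the illustration.

```python
def stack_hsps(query, hsps):
    """Place each pairwise hit on the query's coordinate system.

    hsps: list of (name, q_start, q_aln, s_aln), where q_start is the
    1-based query position of the first aligned residue and q_aln / s_aln
    are the gapped query and subject strings of one BLAST alignment.
    (This tuple layout is an assumption of the sketch.)
    """
    rows = []
    for name, q_start, q_aln, s_aln in hsps:
        row = ["."] * len(query)          # "." = query position not covered
        pos = q_start - 1
        for q_ch, s_ch in zip(q_aln, s_aln):
            if q_ch == "-":
                continue                  # insertion relative to the query: dropped
            row[pos] = s_ch               # may be "-" for a deletion in the subject
            pos += 1
        rows.append((name, "".join(row)))
    return rows


def conserved_columns(query, rows, min_depth=2, min_fraction=1.0):
    """Mark with '*' the columns where enough hits match the query residue."""
    marks = []
    for i, q_ch in enumerate(query):
        column = [seq[i] for _, seq in rows if seq[i] != "."]
        identical = sum(1 for c in column if c == q_ch)
        if len(column) >= min_depth and identical / len(column) >= min_fraction:
            marks.append("*")
        else:
            marks.append(" ")
    return "".join(marks)


# Toy data, invented for the example (not real BLAST output).
query = "MKTAYIAKQR"
hsps = [
    ("hit1", 2, "KTAYIA", "KTAFIA"),
    ("hit2", 3, "TA-YIAKQ", "TAGYLAKQ"),
]
rows = stack_hsps(query, hsps)
marks = conserved_columns(query, rows)

print(f"{'query':<6}{query}")
for name, seq in rows:
    print(f"{name:<6}{seq}")
print(f"{'':<6}{marks}")
```

Anchoring every row on the query keeps all rows the same length, so conserved segments line up as vertical runs of marks. The cost is that insertions in the database sequences are lost, which a real tool would have to handle explicitly.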
