BAliBASE (Benchmark Alignment dataBASE): enhancements for repeats, transmembrane sequences and circular permutations

BAliBASE is specifically designed to serve as an evaluation resource to address all the problems encountered when aligning complete sequences. The database contains high-quality, manually constructed multiple sequence alignments together with detailed annotations. The alignments are all based on three-dimensional structural superpositions, with the exception of the transmembrane sequences. The first release provided sets of reference alignments dealing with the problems of high variability, unequal repartition, large N/C-terminal extensions and internal insertions. Here we describe version 2.0 of the database, which incorporates three new reference sets of alignments containing structural repeats, transmembrane sequences and circular permutations, to evaluate the accuracy of detection/prediction and alignment of these complex sequences. BAliBASE can be viewed at the web site http://www-igbmc.u-strasbg.fr/BioInfo/BAliBASE2/index.html or can be downloaded from ftp://ftp-igbmc.u-strasbg.fr/pub/BAliBASE2/.
