iPBA: a tool for protein structure comparison using sequence alignment strategies

With the immense growth in the number of available protein structures, fast and accurate structure comparison has become essential. We propose an efficient method for structure comparison based on a structural alphabet. Protein Blocks (PBs) are a widely used structural alphabet of 16 pentapeptide conformations that can fairly approximate a complete protein chain. A 3D structure can thus be translated into a 1D sequence of PBs. With a simple Needleman–Wunsch approach and a raw PB substitution matrix, PB-based structural alignments were better than those produced by many popular methods. The iPBA web server presents an improved alignment approach using (i) specialized PB Substitution Matrices (SM) and (ii) an anchor-based alignment methodology. With these developments, the quality of ∼88% of alignments was improved. iPBA alignments were also better than those of DALI, MUSTANG and GANGSTA+ in >80% of the cases. The web server is designed for both pairwise comparisons and database searches. Outputs are given as sequence alignments and superposed 3D structures displayed using PyMOL and Jmol. A local alignment option for detecting sub-structural similarity is also embedded. As a fast and efficient ‘sequence-based’ structure comparison tool, we believe that it will be quite useful to the scientific community. iPBA can be accessed at http://www.dsimb.inserm.fr/dsimb_tools/ipba/.
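
To make the 'sequence-based' idea concrete, the following is a minimal sketch of a global Needleman–Wunsch alignment applied to two PB sequences. It assumes the 16 PBs are written as the letters a to p. The scoring function `toy_score`, the linear gap penalty and the function names are hypothetical placeholders chosen for illustration. They are not the specialized PB substitution matrices or the anchor-based procedure used by iPBA.

```python
from typing import Callable, Tuple

PB_ALPHABET = "abcdefghijklmnop"  # assumed one-letter codes for the 16 Protein Blocks


def toy_score(x: str, y: str) -> int:
    """Hypothetical stand-in for a PB substitution matrix: +2 match, -1 mismatch."""
    return 2 if x == y else -1


def needleman_wunsch(
    s1: str,
    s2: str,
    score: Callable[[str, str], int] = toy_score,
    gap: int = -2,  # hypothetical linear gap penalty
) -> Tuple[int, str, str]:
    """Globally align two PB sequences; return (score, aligned_s1, aligned_s2)."""
    n, m = len(s1), len(s2)
    # dp[i][j] = best score for aligning s1[:i] with s2[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(s1[i - 1], s2[j - 1]),  # align two PBs
                dp[i - 1][j] + gap,                              # gap in s2
                dp[i][j - 1] + gap,                              # gap in s1
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    a1, a2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(s1[i - 1], s2[j - 1]):
            a1.append(s1[i - 1])
            a2.append(s2[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            a1.append(s1[i - 1])
            a2.append("-")
            i -= 1
        else:
            a1.append("-")
            a2.append(s2[j - 1])
            j -= 1
    return dp[n][m], "".join(reversed(a1)), "".join(reversed(a2))


if __name__ == "__main__":
    total, top, bottom = needleman_wunsch("mmmmklop", "mmmklmop")
    print(total)
    print(top)
    print(bottom)
```

In practice the toy scoring would be replaced by a lookup into a 16 × 16 PB substitution matrix, and a local (Smith–Waterman style) variant would serve the sub-structural search mentioned above.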
