Revealing divergent evolution, identifying circular permutations and detecting active-sites by protein structure comparison

Background: Protein structure comparison is one of the most important problems in computational biology and plays a key role in protein structure prediction, fold family classification, motif finding, phylogenetic tree reconstruction and protein docking.

Results: We propose a novel method to compare protein structures accurately and efficiently. The method can be used not only to reveal divergent evolution, but also to identify circular permutations and to detect active sites. Specifically, we define structure alignment as a multi-objective optimization problem: maximizing the number of aligned atoms while minimizing their root mean square distance. By controlling a single distance-related parameter, we can in theory obtain a variety of optimal alignments corresponding to different optimal matching patterns, ranging from a large matching portion to a small one. The number of variables in our algorithm grows almost linearly with the number of atoms in the protein pair. In addition to a solid theoretical background, numerical experiments demonstrate a significant improvement of our approach over existing methods in terms of quality and efficiency. In particular, we show that divergent evolution, circular permutations and active sites (or structural motifs) can be identified by our method. The software SAMO is available upon request from the authors, or from http://zhangroup.aporc.org/bioinfo/samo/ and http://intelligent.eic.osaka-sandai.ac.jp/chenen/samo.htm.

Conclusion: A novel formulation is proposed to accurately align protein structures in the framework of multi-objective optimization, based on a sequence order-independent strategy. A fast and accurate algorithm based on bipartite matching is developed by exploiting the special features of the problem. Convergence of the computation is shown in experiments and is also proven theoretically.
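The trade-off between the number of aligned atoms and their root mean square distance can be illustrated with a small sketch. It is not the SAMO implementation. It assumes the two structures are already superposed (the rigid-body fitting step is omitted) and it scalarizes the two objectives in one hypothetical way: a pair of atoms closer than a cutoff `d0` earns a reward of `d0**2 - dist**2`, and a maximum-reward bipartite matching is found with the Hungarian algorithm. The names `align`, `min_cost_assignment` and `d0`, and the toy coordinates, are assumptions made for illustration.

```python
import math


def min_cost_assignment(cost):
    """Hungarian algorithm for a cost matrix with rows <= columns.

    Returns a list of (row, col) pairs covering every row once.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j] = row matched to column j (1-based, 0 = free)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:  # flip the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]


def align(a, b, d0):
    """Order-independent matching of two superposed 3-D point lists.

    Pairs farther apart than d0 get cost 0, so leaving them unmatched is
    just as good; closer pairs get the negative cost dist**2 - d0**2.
    Returns (sorted list of (i, j) index pairs, RMSD over those pairs).
    """
    if not a or not b:
        return [], 0.0
    if len(a) > len(b):  # the solver above expects rows <= columns
        pairs, rmsd = align(b, a, d0)
        return sorted((i, j) for j, i in pairs), rmsd
    d2 = [[sum((x - y) ** 2 for x, y in zip(pa, pb)) for pb in b] for pa in a]
    cost = [[min(x - d0 * d0, 0.0) for x in row] for row in d2]
    pairs = sorted((i, j) for i, j in min_cost_assignment(cost) if cost[i][j] < 0)
    if not pairs:
        return pairs, 0.0
    rmsd = math.sqrt(sum(d2[i][j] for i, j in pairs) / len(pairs))
    return pairs, rmsd


# Toy data: B lists roughly the same points as A in a circularly shifted
# order, plus one point (B[3]) that lies far from everything in A.
A = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (9.0, 0.0, 0.0)]
B = [(6.2, 0.0, 0.0), (9.1, 0.0, 0.0), (0.0, 0.5, 0.0), (3.0, 4.0, 0.0)]

for d0 in (0.3, 1.0, 5.0):
    pairs, rmsd = align(A, B, d0)
    print(d0, len(pairs), round(rmsd, 3), pairs)
```

Because the matching ignores sequence order, the shifted ordering of `B` does not prevent the correspondence from being found. With these toy coordinates, raising `d0` from 0.3 to 1.0 to 5.0 grows the alignment from 2 to 3 to 4 pairs while the RMSD increases, which mirrors the range from a small, tight match to a large, loose one.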
