A tree-decomposition approach to protein structure prediction

This paper proposes a tree decomposition of protein structures that can be used to efficiently solve two key subproblems of protein structure prediction: protein threading for backbone prediction and protein side-chain prediction. To develop a unified tree-decomposition-based approach to these two subproblems, we model them as a geometric neighborhood graph labeling problem. Theoretically, a low-degree polynomial-time algorithm can decompose a geometric neighborhood graph $G=(V,E)$ into components of size $O(|V|^{2/3} \log|V|)$. The computational complexity of the tree-decomposition-based graph labeling algorithms is $O(|V| \Delta^{tw+1})$, where $\Delta$ is the average number of possible labels for each vertex and $tw$ $(= O(|V|^{2/3} \log|V|))$ is the tree width of $G$. Empirically, $tw$ is very small, and the tree-decomposition method solves these two problems very efficiently. This paper also compares the computational efficiency of the tree-decomposition approach with that of the linear programming approach to these two problems and identifies the condition under which the tree-decomposition approach is more efficient. Experimental results indicate that the tree-decomposition approach is more efficient most of the time.
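To make the complexity bound concrete, the following is a minimal Python sketch of graph labeling by vertex elimination, not the paper's implementation. It assumes the labeling objective is a sum of per-vertex and per-edge costs to be minimized, and it uses a greedy minimum-degree elimination order, which is equivalent to dynamic programming over the tree decomposition that order induces. The function names (`min_degree_order`, `min_labeling_cost`), the data layout, and the toy 4-cycle instance are all hypothetical and chosen for illustration only.

```python
from itertools import product

def min_degree_order(vertices, edges):
    """Greedy min-degree elimination order; returns (order, width upper bound)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    order, width = [], 0
    while adj:
        v = min(adj, key=lambda x: len(adj[x]))  # vertex of smallest degree
        nbrs = adj.pop(v)
        width = max(width, len(nbrs))            # bag = {v} plus its neighbors
        for a in nbrs:                           # connect neighbors (fill-in edges)
            adj[a].discard(v)
            adj[a] |= nbrs - {a}
        order.append(v)
    return order, width

def min_labeling_cost(labels, node_cost, edge_cost):
    """Minimum total cost of a labeling (assumed additive objective).

    labels:    {v: [label, ...]}
    node_cost: {v: {label: cost}}
    edge_cost: {(u, v): {(label_u, label_v): cost}}
    """
    # Each factor is (scope, table); tables are keyed by label tuples over the scope.
    factors = [((v,), {(l,): node_cost[v][l] for l in labels[v]}) for v in labels]
    factors += [((u, v), dict(t)) for (u, v), t in edge_cost.items()]
    order, width = min_degree_order(list(labels), list(edge_cost))
    for v in order:
        touching = [f for f in factors if v in f[0]]
        factors = [f for f in factors if v not in f[0]]
        scope = tuple(sorted({u for s, _ in touching for u in s} - {v}))
        table = {}
        # At most (labels per vertex)^(width + 1) combinations per eliminated vertex.
        for assign in product(*(labels[u] for u in scope)):
            env = dict(zip(scope, assign))
            best = float("inf")
            for l in labels[v]:
                env[v] = l
                total = sum(t[tuple(env[u] for u in s)] for s, t in touching)
                best = min(best, total)
            table[assign] = best
        factors.append((scope, table))
    # Every remaining factor has an empty scope and holds one constant.
    return sum(t[()] for _, t in factors), width

# Toy instance (hypothetical): a 4-cycle with two labels per vertex.
labels = {v: [0, 1] for v in range(4)}
node_cost = {0: {0: 1, 1: 0}, 1: {0: 0, 1: 2}, 2: {0: 3, 1: 0}, 3: {0: 0, 1: 1}}
clash = {(a, b): 2 if a == b else 0 for a in (0, 1) for b in (0, 1)}
edge_cost = {(0, 1): clash, (1, 2): clash, (2, 3): clash, (3, 0): clash}

cost, width = min_labeling_cost(labels, node_cost, edge_cost)
print("minimum cost:", cost, "elimination width:", width)
```

Each elimination step enumerates label combinations over one bag of at most `width + 1` vertices, which is where a bound of the form $O(|V| \Delta^{tw+1})$ comes from. The width reported by the greedy order is only an upper bound on the true tree width, and the sketch returns the optimal cost without recovering the labeling itself.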
