Approximate Inference and Protein-Folding

Side-chain prediction is an important subtask in the protein-folding problem. We show that finding a minimal energy side-chain configuration is equivalent to performing inference in an undirected graphical model. The graphical model is relatively sparse yet has many cycles. We used this equivalence to assess the performance of approximate inference algorithms in a real-world setting. Specifically we compared belief propagation (BP), generalized BP (GBP) and naive mean field (MF). In cases where exact inference was possible, max-product BP always found the global minimum of the energy (except in few cases where it failed to converge), while other approximation algorithms of similar complexity did not. In the full protein data set, max-product BP always found a lower energy configuration than the other algorithms, including a widely used protein-folding software (SCWRL).

[1]  William T. Freeman,et al.  Learning to Estimate Scenes from Images , 1998, NIPS.

[2]  J. Mendes,et al.  Improvement of side-chain modeling in proteins with the self-consistent mean field theory method based on an analysis of the factors influencing prediction. , 1999, Biopolymers.

[3]  Brendan J. Frey,et al.  Very loopy belief propagation for unwrapping phase images , 2001, NIPS.

[4]  William T. Freeman,et al.  Understanding belief propagation and its generalizations , 2003 .

[5]  Michael I. Jordan,et al.  Loopy Belief Propagation for Approximate Inference: An Empirical Study , 1999, UAI.

[6]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[7]  William T. Freeman,et al.  On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs , 2001, IEEE Trans. Inf. Theory.

[8]  Kevin Murphy,et al.  Bayes net toolbox for Matlab , 1999 .

[9]  L L Looger,et al.  Generalized dead-end elimination algorithms make large-scale protein side-chain structure prediction tractable: implications for protein design and structural genomics. , 2001, Journal of molecular biology.

[10]  M. Levitt,et al.  Accuracy of side‐chain prediction upon near‐native protein backbones generated by ab initio folding methods , 1998, Proteins.

[11]  Robert Cowell,et al.  Introduction to Inference for Bayesian Networks , 1998, Learning in Graphical Models.

[12]  Z. Xiang,et al.  Extending the accuracy limits of prediction for side-chain conformations. , 2001, Journal of molecular biology.

[13]  N. Grishin,et al.  Side‐chain modeling with an optimized scoring function , 2002, Protein science : a publication of the Protein Society.

[14]  Roland L. Dunbrack,et al.  Backbone-dependent rotamer library for proteins. Application to side-chain prediction. , 1993, Journal of molecular biology.

[15]  Johan Desmet,et al.  The dead-end elimination theorem and its use in protein side-chain positioning , 1992, Nature.