Approximate Inference and Side-chain Prediction

Side-chain prediction is an important subtask of the protein-folding problem. We show that finding a minimum-energy side-chain configuration is equivalent to performing inference in an undirected graphical model. The graphical model is relatively sparse yet has many cycles. We used this equivalence to assess the performance of approximate inference algorithms in a real-world setting. Specifically, we were interested in two questions: (1) which approximate inference algorithms give the best performance, and (2) how does this performance compare to the state of the art in computational biology? We looked at three tasks in side-chain graphical models: finding the minimum-energy configuration, finding the M best configurations, and approximating the free energy and conformational entropy. In all three tasks we found that belief propagation gave the best results among the approximate inference algorithms, and in many cases it outperformed state-of-the-art algorithms developed in computational biology.
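As an illustration of the equivalence described above, the following is a minimal sketch, not the method evaluated in the paper. It assumes a pairwise energy function (one term per residue plus one term per interacting pair) and a Boltzmann distribution P(r) proportional to exp(-E(r)/T), so that the minimum-energy configuration is the most probable one. The three-residue model, its energy values, and the temperature T are all invented for illustration, and the three quantities are computed by exhaustive enumeration, which is feasible only for toy problems; approximate inference such as belief propagation is what one would use at realistic sizes.

```python
import itertools
import math

# Hypothetical toy model: 3 residues forming a cycle, each with 2 or 3 rotamers.
# All energy values below are invented for illustration only.
singleton = [
    [0.0, 1.2],       # residue 0: energy of each rotamer
    [0.5, 0.0, 0.9],  # residue 1
    [0.3, 0.0],       # residue 2
]
pairwise = {
    (0, 1): [[0.0, 2.0, 0.4], [1.0, 0.0, 0.2]],
    (1, 2): [[0.7, 0.0], [0.0, 1.5], [0.3, 0.3]],
    (0, 2): [[0.0, 0.8], [0.6, 0.0]],
}
T = 1.0  # temperature in the same units as the energies (assumption)


def energy(config):
    """Total energy: sum of singleton terms plus pairwise interaction terms."""
    e = sum(singleton[i][r] for i, r in enumerate(config))
    e += sum(table[config[i]][config[j]] for (i, j), table in pairwise.items())
    return e


# Enumerate every joint rotamer assignment (exponential in the number of residues).
configs = list(itertools.product(*(range(len(s)) for s in singleton)))
energies = {c: energy(c) for c in configs}

# Boltzmann distribution defined by the energy function.
Z = sum(math.exp(-e / T) for e in energies.values())
prob = {c: math.exp(-e / T) / Z for c, e in energies.items()}

# Task 1: the minimum-energy configuration is the most probable (MAP) assignment.
min_energy_config = min(configs, key=energies.get)
map_config = max(configs, key=prob.get)

# Task 2: the M best (lowest-energy) configurations.
M = 3
m_best = sorted(configs, key=energies.get)[:M]

# Task 3: free energy, average energy, and entropy (in units where k_B = 1).
free_energy = -T * math.log(Z)
avg_energy = sum(prob[c] * energies[c] for c in configs)
entropy = -sum(p * math.log(p) for p in prob.values())

print("minimum-energy configuration:", min_energy_config)
print("MAP configuration:", map_config)
print("M best configurations:", m_best)
print("free energy:", free_energy, "entropy:", entropy)
```

In this sketch the free energy satisfies F = <E> - T S, which is the identity that links the partition function to the conformational entropy.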
