Prediction of protein beta-residue contacts by Markov logic networks with grounding-specific weights

MOTIVATION: Accurate prediction of contacts between beta-strand residues can contribute significantly to ab initio prediction of the 3D structure of many proteins. Contacts within the same protein are highly interdependent, so significant improvements can be expected from statistical relational learners, which drop the usual machine learning assumption that examples are independent and identically distributed. Furthermore, the dependencies among beta-residue contacts are subject to strong regularities, many of which are known a priori. In this article, we take advantage of Markov logic, a statistical relational learning framework that can capture dependencies between contacts and constrain the solution according to domain knowledge expressed as weighted rules in a logical language.

RESULTS: We introduce a novel hybrid architecture based on neural and Markov logic networks with grounding-specific weights. On a non-redundant dataset, our method achieves an F1 measure of 44.9%, with 47.3% precision and 42.7% recall, which is significantly better (P < 0.01) than the previously reported performance of 2D recursive neural networks. Our approach also significantly increases the number of chains for which beta-strands are nearly perfectly paired (36% of the chains are predicted with F1 ≥ 70% on the coarse map). It also outperforms more general contact predictors on recent CASP 2008 targets.
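The reported figures can be checked against each other with a short calculation. The sketch below assumes that the F1 measure is the usual harmonic mean of precision and recall computed from the aggregate values quoted above; the abstract does not say whether scores were pooled or averaged per chain, so this is only a consistency check, and the function name `f1_score` is introduced here for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when nothing is predicted correctly
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract (assumed to be aggregate values).
reported_precision = 0.473
reported_recall = 0.427

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {100 * f1:.1f}%")  # prints "F1 = 44.9%"
```

Under that assumption, the harmonic mean of 47.3% and 42.7% rounds to the 44.9% stated in the results.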
