Protein docking with predicted constraints

This paper presents a constraint-based method for improving protein docking results. Efficient constraint propagation cuts over 95% of the search time for finding the configurations with the largest contact surface, provided a contact is specified between two amino acid residues. This makes it possible to scan a large number of potentially correct constraints, lowering the requirements for useful contact predictions. While other approaches depend heavily on accurate contact predictions, ours requires only that at least one correct contact be retained in a set of, for example, one hundred constraints to test. This feature makes it feasible to use readily available sequence data to predict specific potential contacts. Although such prediction is too inaccurate for most purposes, we demonstrate with a naïve Bayes classifier that, when combined with our constrained docking algorithm, it is accurate enough to more than double the average number of acceptable models retained during the crucial filtering stage of protein docking. All software developed in this work is freely available as part of the Open Chemera Library.
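The requirement that at least one correct contact survive in the set of tested constraints can be illustrated with a minimal Python sketch. It is not the Open Chemera implementation: the function names, the `(residue_a, residue_b)` pair representation, and the toy scores are all assumptions made for illustration, and the scores stand in for whatever a classifier such as naïve Bayes would output.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical representation: (residue index in protein A, residue index in protein B)
Contact = Tuple[int, int]


def top_k_constraints(scored: Iterable[Tuple[Contact, float]], k: int = 100) -> List[Contact]:
    """Return the k candidate contacts with the highest predicted score."""
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    return [contact for contact, _ in ranked[:k]]


def has_correct_contact(constraints: Iterable[Contact], true_contacts: Set[Contact]) -> bool:
    """True if at least one tested constraint is a real contact."""
    return any(c in true_contacts for c in constraints)


# Toy data (invented for illustration, not taken from the paper).
scored = [((1, 7), 0.90), ((2, 5), 0.40), ((3, 9), 0.75), ((4, 4), 0.10)]
true_contacts = {(3, 9)}

selected = top_k_constraints(scored, k=2)
print(selected)                                      # [(1, 7), (3, 9)]
print(has_correct_contact(selected, true_contacts))  # True
```

In this toy case the top-ranked prediction is wrong, yet the selected set is still usable because the second candidate is a true contact. Each selected constraint would then drive its own constrained docking run.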
