Predicting protein interaction interfaces from protein sequences: Case studies of subtilisin and phycocyanin

Identifying protein interaction interfaces is essential for understanding the molecular mechanisms underlying biological phenomena. Here, we present a novel method, PIFPAM, for predicting protein interaction interfaces from sequences using a PAM matrix. Sequence alignments for the interacting proteins were constructed and parsed into segments using sliding windows. A distance matrix was calculated for each segment, and the correlation coefficients between segments were estimated from these matrices. The interaction interfaces were then predicted by extracting highly correlated segment pairs from the correlation map. The predictions achieved an accuracy of 0.41–0.71 for eight intraprotein interaction examples and 0.07–0.60 for four interprotein interaction examples. Compared with three previously published methods, PIFPAM predicted more contacting site pairs for 11 of the 12 example proteins, and at least 34% more contacting site pairs for eight of them. The factors affecting the predictions were also analyzed. Because PIFPAM uses only the alignments of the two interacting proteins as input, it is especially useful when no three‐dimensional protein structure data are available. Proteins 2008. © 2007 Wiley‐Liss, Inc.
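The pipeline summarized in the abstract (sliding windows, a per-segment distance matrix, and correlation between segments) can be illustrated with a minimal sketch. This is not the PIFPAM implementation: the function names, the window width, and the threshold are hypothetical, and the distance used here is a simple mismatch fraction standing in for a PAM-based distance. It also assumes both alignments list the same species in the same order.

```python
from itertools import combinations
from math import sqrt


def windows(alignment, width):
    """Yield (start, segment); a segment holds one substring per aligned sequence."""
    length = len(alignment[0])
    for start in range(length - width + 1):
        yield start, [seq[start:start + width] for seq in alignment]


def distance_vector(segment):
    """Upper triangle of the pairwise distance matrix for one segment.

    Assumption: mismatch fraction is a stand-in for a PAM-based distance.
    """
    return [
        sum(a != b for a, b in zip(s, t)) / len(s)
        for s, t in combinations(segment, 2)
    ]


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def correlated_segments(aln_a, aln_b, width=3, threshold=0.9):
    """Return (start_a, start_b, r) for segment pairs with r >= threshold.

    aln_a and aln_b must list the same species in the same order.
    """
    vecs_a = [(i, distance_vector(seg)) for i, seg in windows(aln_a, width)]
    vecs_b = [(j, distance_vector(seg)) for j, seg in windows(aln_b, width)]
    hits = []
    for i, va in vecs_a:
        for j, vb in vecs_b:
            r = pearson(va, vb)
            if r >= threshold:
                hits.append((i, j, r))
    return hits
```

Calling `correlated_segments(aln_a, aln_b)` on two lists of equal-length aligned strings returns candidate segment pairs. In this sketch the returned start positions index alignment columns, so mapping them back to residue numbers would be a separate step.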
