Predicting protein domain interactions from coevolution of conserved regions

Knowledge of protein and domain interactions provides crucial insights into their function within a cell. Several computational methods have been proposed to detect interactions between proteins and their constitutive domains. In this work, we focus on approaches based on correlated evolution (coevolution) of the sequences of interacting proteins. In this type of approach, often referred to as the mirrortree method, a high correlation between the evolutionary histories of two proteins is taken as an indicator that they interact. Recently, it has been observed that subtracting the underlying speciation signal, which separates coevolution due to common speciation divergence from coevolution due to the common function of interacting pairs, greatly improves the predictive power of the mirrortree approach. In this article, we investigate possible improvements and limitations of this method. In particular, we demonstrate that the performance of the mirrortree method can be further improved by restricting the coevolution analysis to the relatively conserved regions of the protein domain sequences (disregarding highly divergent regions). We provide a theoretical validation of our results, leading to new insights into the interplay between coevolution and speciation of interacting proteins. Proteins 2007. © 2007 Wiley‐Liss, Inc.
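
As a rough illustration of the mirrortree idea described above, the following minimal sketch scores a pair of proteins by the Pearson correlation of their pairwise evolutionary distance matrices, optionally after subtracting a speciation background. It is not the authors' implementation: the function names, the plain element-wise subtraction, and the assumption that the species distances are already on the same scale as the protein distances are all illustrative choices.

```python
import math


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def mirrortree_score(dist_a, dist_b, species_dist=None):
    """Correlate two proteins' distance matrices over the same ordered species.

    If species_dist is given, it is subtracted element-wise from both
    matrices first. Assumption: it is already rescaled to be comparable
    with the protein distances; published variants differ in this step.
    """
    vec_a = upper_triangle(dist_a)
    vec_b = upper_triangle(dist_b)
    if species_dist is not None:
        vec_s = upper_triangle(species_dist)
        vec_a = [a - s for a, s in zip(vec_a, vec_s)]
        vec_b = [b - s for b, s in zip(vec_b, vec_s)]
    return pearson(vec_a, vec_b)


# Toy distance matrices for three species (made-up numbers).
dist_a = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
dist_b = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]
species = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

raw_score = mirrortree_score(dist_a, dist_b)
corrected_score = mirrortree_score(dist_a, dist_b, species)
```

Under the same assumptions, restricting the analysis to conserved regions would change only how `dist_a` and `dist_b` are computed (from a subset of alignment columns); the scoring step stays the same.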
