Protein-protein interaction as a predictor of subcellular location
Research article, BMC Systems Biology (BioMed Central)

Background: Many biological processes are mediated by dynamic interactions between and among proteins. To interact, two proteins must co-occur spatially and temporally. Because protein-protein interactions (PPIs) and subcellular location (SCL) are discovered via separate empirical approaches, PPI and SCL annotations are independent and might complement each other in helping us to understand the role of individual proteins in cellular networks. We expect reliable PPI annotations to show that proteins interacting in vivo are co-located in the same cellular compartment. Our goal here is to evaluate the potential of using PPI annotation to determine the SCL of proteins in human, mouse, fly and yeast, and to identify and quantify the factors that contribute to this complementarity.

Results: Using publicly available data, we evaluate the hypothesis that interacting proteins must be co-located within the same subcellular compartment. Based on a large, manually curated PPI dataset, we demonstrate that a substantial proportion of interacting proteins are in fact co-located. We develop an approach to predict the SCL of a protein based on the SCL of its interaction partners, given sufficient confidence in the interaction itself. The frequency of false-positive PPIs can be reduced by using six lines of supporting evidence: three based on the type of recorded evidence (empirical approach, multiplicity of databases, and multiplicity of literature citations) and three based on the type of biological evidence (inferred biological process, domain-domain interactions, and orthology relationships). Biological evidence is more effective than recorded evidence. Our approach performs better than four existing prediction methods in identifying the SCL of membrane proteins, and as well as or better than them for soluble proteins.

Conclusion: Understanding cellular systems requires knowledge of the SCL of interacting proteins. We show how PPI data can be used more effectively to yield reliable SCL predictions for both soluble and membrane proteins. Scope exists for further improvement in our understanding of cellular function through consideration of the biological context of molecular interactions.
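
The two computations described in the Results can be illustrated with a minimal Python sketch: the fraction of interacting pairs that share a compartment, and a prediction of a protein's SCL from the SCL of its interaction partners. The abstract does not specify the scoring scheme, so the confidence-weighted vote used here is an assumption, not the authors' method. The protein identifiers, compartment names, confidence values and the `min_confidence` threshold are all hypothetical; the single confidence score stands in for whatever combination of the six lines of evidence is used to filter likely false positives.

```python
from collections import defaultdict

# Hypothetical toy data: known SCL annotations and scored interactions.
# Each interaction is (protein_a, protein_b, confidence in [0, 1]).
SCL = {
    "P1": {"nucleus"},
    "P2": {"nucleus"},
    "P3": {"cytoplasm"},
    "P4": {"nucleus", "cytoplasm"},
}
INTERACTIONS = [
    ("P1", "P2", 0.9),
    ("P1", "P3", 0.3),
    ("P2", "P4", 0.8),
    ("P5", "P1", 0.9),
    ("P5", "P2", 0.7),
    ("P5", "P3", 0.6),
]


def colocated_fraction(interactions, scl, min_confidence=0.0):
    """Fraction of sufficiently confident pairs, with both partners
    annotated, that share at least one compartment."""
    shared = total = 0
    for a, b, conf in interactions:
        if conf < min_confidence or a not in scl or b not in scl:
            continue
        total += 1
        if scl[a] & scl[b]:
            shared += 1
    return shared / total if total else 0.0


def predict_scl(protein, interactions, scl, min_confidence=0.5):
    """Predict a compartment by a confidence-weighted vote over the
    annotated partners of `protein` (assumed scheme, see lead-in).
    Returns None when no confident, annotated partner exists."""
    votes = defaultdict(float)
    for a, b, conf in interactions:
        if conf < min_confidence:
            continue  # drop likely false-positive interactions
        if a == protein:
            partner = b
        elif b == protein:
            partner = a
        else:
            continue
        for location in scl.get(partner, ()):
            votes[location] += conf
    if not votes:
        return None
    # Ties are broken alphabetically so the result is deterministic.
    return max(sorted(votes), key=votes.get)


if __name__ == "__main__":
    print(colocated_fraction(INTERACTIONS, SCL))
    print(colocated_fraction(INTERACTIONS, SCL, min_confidence=0.5))
    print(predict_scl("P5", INTERACTIONS, SCL))
```

In this toy example, raising the confidence threshold removes the one interaction whose partners are annotated to different compartments, which mirrors the claim that filtering false-positive PPIs increases the proportion of co-located pairs. Real annotations would also need a mapping between compartment vocabularies before sets can be compared.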
