Inferring high-confidence human protein-protein interactions

Background: As numerous experimental factors drive the acquisition, identification, and interpretation of protein-protein interactions (PPIs), aggregated assemblies of human PPI data invariably contain experiment-dependent noise. Ascertaining the reliability of PPIs collected from these diverse studies and scoring them to infer high-confidence networks is a non-trivial task. Moreover, a large number of PPIs share the same number of reported occurrences, making it impossible to distinguish the reliability of these PPIs and rank-order them. For example, for the data analyzed here, we found that the majority (>83%) of currently available human PPIs have been reported only once.

Results: In this work, we proposed an unsupervised statistical approach to score a set of diverse, experimentally identified PPIs from nine primary databases to create subsets of high-confidence human PPI networks. We evaluated this ranking method by comparing it with other methods and assessing their ability to retrieve protein associations from a number of diverse and independent reference sets. These reference sets contain known biological data that are either directly or indirectly linked to interactions between proteins. We quantified the average effect of using ranked protein interaction data to retrieve this information and showed that, when compared to randomly ranked interaction data sets, the proposed method created a larger enrichment (~134%) than either ranking based on the hypergeometric test (~109%) or occurrence ranking (~46%).

Conclusions: From our evaluations, it was clear that ranked interactions were always of value because higher-ranked PPIs had a higher likelihood of retrieving high-confidence experimental data. Reducing the noise inherent in aggregated experimental PPIs via our ranking scheme further increased the accuracy and enrichment of PPIs derived from a number of biologically relevant data sets. These results suggest that using our high-confidence protein interactions at different levels of confidence will help clarify the topological and biological properties associated with human protein networks.
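
To make the two baseline rankings named in the abstract concrete, the following is a minimal sketch of occurrence counting and a hypergeometric-test score for protein pairs. It is not the authors' proposed method, and the abstract does not specify how either baseline was parameterized. The function names, the toy observations, and the choice of null model (each protein's number of reported interactions is held fixed, and the score is the probability of seeing a pair at least as often as observed) are assumptions made for illustration only.

```python
from collections import Counter
from math import comb


def hypergeom_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, K marked items, n draws)."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def rank_pairs(observations):
    """Rank protein pairs by hypergeometric p-value (smaller = more confident).

    `observations` is a list of (protein_a, protein_b) reports, one entry per
    reported occurrence. Self-interactions are assumed to be absent.
    """
    # Occurrence ranking uses only this count per unordered pair.
    pair_counts = Counter(frozenset(pair) for pair in observations)
    # Number of reports in which each protein takes part.
    protein_counts = Counter(p for pair in observations for p in pair)
    n_reports = len(observations)

    scored = []
    for pair, k in pair_counts.items():
        a, b = sorted(pair)
        p_value = hypergeom_tail(k, n_reports, protein_counts[a], protein_counts[b])
        scored.append((p_value, a, b, k))
    return sorted(scored)


if __name__ == "__main__":
    # Hypothetical toy data: A-C and D-E are both reported exactly once.
    reports = [("A", "B"), ("A", "B"), ("A", "C"), ("D", "E")]
    for p_value, a, b, k in rank_pairs(reports):
        print(f"{a}-{b}: occurrences={k}, p={p_value:.3f}")
```

In this toy input, the pairs A-C and D-E tie under occurrence ranking, yet they receive different hypergeometric scores because protein A appears in many reports. This is the kind of tie-breaking between equally reported PPIs that the Background motivates.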
