HINT: High-quality protein interactomes and their applications in understanding human disease

Background: A global map of protein-protein interactions in cellular systems provides key insights into the workings of an organism. A repository of well-validated, high-quality protein-protein interactions can be used in both large- and small-scale studies to generate and validate a wide range of functional hypotheses.

Results: We develop HINT (http://hint.yulab.org), a database of high-quality protein-protein interactomes for human, Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Oryza sativa. These were collected from several databases and filtered both systematically and manually to remove low-quality or erroneous interactions. The resulting datasets are classified by type (binary physical interactions vs. co-complex associations) and by data source (high-throughput systematic setups vs. literature-curated small-scale experiments). We find strong sociological sampling biases in literature-curated datasets of small-scale interactions. An interactome without such sampling biases was used to understand the network properties of human disease genes: hubs are unlikely to cause disease, but if they do, they usually cause multiple disorders.

Conclusions: HINT is of significant interest to researchers in all fields of biology, as it addresses the ubiquitous need for a repository of high-quality protein-protein interactions. These datasets can be used to generate specific hypotheses about individual proteins and pathways, as well as to analyze global properties of cellular networks. HINT will be regularly updated and all versions will be tracked.
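As an illustration of the classification by interaction type and data source, and of what a hub is in such a network, the following minimal Python sketch selects one class of interactions from a toy edge list and counts each protein's interaction partners. The record layout, the protein names, and the degree cutoff for calling a protein a hub are all assumptions made for this example; they are not HINT's actual file format or the definition used in the study.

```python
from collections import Counter

# Hypothetical toy records: (protein_a, protein_b, interaction_type, data_source).
# This layout is an assumption for illustration, not HINT's download format.
INTERACTIONS = [
    ("P1", "P2", "binary", "high-throughput"),
    ("P1", "P3", "binary", "high-throughput"),
    ("P1", "P4", "binary", "literature-curated"),
    ("P2", "P3", "co-complex", "high-throughput"),
    ("P4", "P5", "binary", "high-throughput"),
    ("P1", "P5", "binary", "high-throughput"),
]


def select(interactions, interaction_type, data_source):
    """Keep only the pairs of the requested type and data source."""
    return [
        (a, b)
        for a, b, kind, source in interactions
        if kind == interaction_type and source == data_source
    ]


def degrees(edges):
    """Count distinct interaction partners per protein (undirected edges)."""
    counts = Counter()
    # Sorting each pair makes (a, b) and (b, a) collapse into one edge.
    for a, b in {tuple(sorted(edge)) for edge in edges}:
        counts[a] += 1
        if b != a:  # count a self-interaction only once
            counts[b] += 1
    return counts


def hubs(degree_by_protein, min_degree):
    """Proteins with at least min_degree partners (the cutoff is an assumption)."""
    return {p for p, d in degree_by_protein.items() if d >= min_degree}


if __name__ == "__main__":
    edges = select(INTERACTIONS, "binary", "high-throughput")
    degree_by_protein = degrees(edges)
    print(dict(degree_by_protein))
    print(hubs(degree_by_protein, min_degree=3))
```

Restricting the selection to systematically mapped binary interactions mirrors the idea of analyzing an interactome free of literature sampling bias; the same functions apply unchanged to any other type and source combination.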
