Graphical Features of Functional Genes in Human Protein Interaction Network

With the completion of the Human Genome Project, it has become feasible to investigate the large-scale human protein interaction network (HPIN) using complex network theory. Proteins are encoded by genes. Essential, viable, disease, conserved, housekeeping (HK) and tissue-enriched (TE) genes are functional genes whose products are organized and function through interaction networks. Based on up-to-date data from various databases and the literature, two large-scale HPINs and six subnetworks are constructed. We show that the HPINs and most of the subnetworks are sparse, small-world, scale-free and disassortative, and exhibit hierarchical modularity. Among the six subnetworks, the essential, disease and HK subnetworks are more densely connected than the others. Statistical analysis of the topological structure of the HPIN reveals that the lethal, conserved, HK and TE genes have hallmark graphical features. Receiver operating characteristic (ROC) curves indicate that the essential genes can be distinguished from the viable ones with an accuracy of almost 70%. Closeness, semi-local and eigenvector centralities can distinguish the HK genes from the TE ones with an accuracy of around 82%. Furthermore, the Venn diagram, cluster dendrograms and classifications of disease genes reveal that some classes of disease genes have hallmark graphical features, especially cancer genes, HK disease genes and TE disease genes. These findings facilitate the identification of some functional genes from topological structure. The investigations shed some light on the characteristics of the complete interactome, with potential implications for network medicine and biological network control.
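As a rough illustration of how a centrality score can be evaluated as a classifier with an ROC curve, the following minimal sketch computes closeness centrality on a small toy graph and then the area under the ROC curve (AUC) for separating two gene classes. The graph, the node names, the `essential` label set and the helper names `closeness` and `auc` are all hypothetical and chosen for illustration only; this is not the pipeline or data used in the study.

```python
from collections import deque


def closeness(adj, source):
    """Closeness centrality of `source`: reachable nodes / sum of BFS distances."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0


def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a random
    positive scores higher than a random negative (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy interaction network (undirected edges between proteins).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E"), ("E", "F")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

# Hypothetical labels: which nodes are treated as "essential" genes.
essential = {"A", "D"}

nodes = sorted(adj)
scores = [closeness(adj, n) for n in nodes]
labels = [n in essential for n in nodes]

print("AUC of closeness as an essentiality score:", auc(scores, labels))
```

An AUC near 0.5 would mean the centrality carries no information about the class label, while values closer to 1 indicate better separation. Other centralities, such as degree or eigenvector centrality, could be swapped in for `closeness` in the same way.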
