Structural Hole Spanner in HumanNet Identifies Disease Gene and Drug Targets

Traditional methods of gene analysis focus mainly on single genes. They are time-consuming and costly, and they neglect the interactions between genes. To overcome these shortcomings, many computational techniques have been proposed for gene analysis. These methods posit that the interactions between genes form a network. Analyzing the topological characteristics of the network identifies one type of node, the “hubs”: a small fraction of nodes that interact with many partners in the network. Several methods are used to find hubs in a network, but none is comprehensive, and extracting further information about genes from a network remains challenging. In this paper, we apply a framework for finding structural hole (SH) spanners to HumanNet. We classify genes into three classes (SH, normal, and non-SH). We also classify genes into three classes according to degree and betweenness. Enrichment analysis reveals that the SH genes are enriched for essential genes. Furthermore, enrichment analyses of disease mutations, viruses, and drugs consistently show that the SH genes are preferred targets and are important in cellular state transition. Functional enrichment analysis shows that the biological functions overrepresented among the SH genes are related to cell function. We demonstrate that finding structural hole spanners in HumanNet identifies SH genes that are essential and indispensable in cellular life, as well as SH genes that are key players in cellular state transition.
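The three-class idea can be illustrated on a toy network. The sketch below is not the SH-spanner framework used in the paper; it swaps in Burt's constraint, a classical structural-hole measure, as a stand-in score, and every name in it (`burt_constraint`, `classify`, the toy graph, the 20% cut-offs) is an assumption made for illustration. A low constraint means a node's neighbors are poorly connected to one another, so the node bridges a structural hole.

```python
# Minimal sketch, assuming Burt's constraint as a stand-in for the
# paper's SH-spanner framework (which is not reproduced here).
# The toy graph and the 20% cut-offs are hypothetical.

def burt_constraint(adj, node):
    """Burt's constraint of `node` in an unweighted, undirected graph."""
    neighbors = adj[node]
    if not neighbors:
        return float("inf")  # an isolated node bridges nothing
    share = 1.0 / len(neighbors)  # p_ij: equal investment in each neighbor
    total = 0.0
    for j in neighbors:
        # Indirect investment in j through shared neighbors q: p_iq * p_qj.
        indirect = sum(
            share * (1.0 / len(adj[q]))
            for q in neighbors
            if q != j and j in adj[q]
        )
        total += (share + indirect) ** 2
    return total


def classify(adj, top_frac=0.2, bottom_frac=0.2):
    """Label nodes 'SH', 'normal', or 'non-SH' by ascending constraint."""
    ranked = sorted(adj, key=lambda n: (burt_constraint(adj, n), n))
    n_top = max(1, round(len(ranked) * top_frac))
    n_bottom = max(1, round(len(ranked) * bottom_frac))
    labels = {n: "normal" for n in ranked}
    for n in ranked[:n_top]:
        labels[n] = "SH"
    for n in ranked[len(ranked) - n_bottom:]:
        labels[n] = "non-SH"
    return labels


# Toy network: two triangles joined only through the bridging node "B".
edges = [
    ("A1", "A2"), ("A2", "A3"), ("A1", "A3"),
    ("C1", "C2"), ("C2", "C3"), ("C1", "C3"),
    ("A1", "B"), ("B", "C1"),
]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

labels = classify(adj)
for name in sorted(adj):
    print(name, len(adj[name]), round(burt_constraint(adj, name), 3), labels[name])
```

In this toy graph the bridging node `B` receives the lowest constraint and is labeled `SH`, even though its degree (2) is lower than that of `A1` and `C1` (3). This shows, in the toy setting only, that a structural-hole score can single out nodes that a degree-based hub ranking would not.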
