Simplifying Weighted Heterogeneous Networks by Extracting h-Structure via s-Degree

In this study, we develop a method to extract the core structure of a weighted heterogeneous network by transforming it into homogeneous networks. Using the standardized z-score, we define the s-degree of a base-node as the sum of the z-scores of all its adjacent edges in the weighted heterogeneous network. We then rank the s-degrees in decreasing order and obtain the core structure via the h-index of a base-homogeneous-network. After reducing the edges between attribute nodes and base-nodes to those adjacent to the core structure, we obtain the heterogeneous core structure of the weighted network, which we call the h-structure. We find that the h-structure of a heterogeneous network contains less than 1% of the nodes and edges, which makes it a highly effective simplification of a weighted heterogeneous network. We examine two practical cases: a citation network and a co-purchase network.
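The steps named in the abstract (z-score standardization, s-degree, h-index cut, reduction of attribute edges) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the exact formulas, so several choices here are assumptions: edges are stored as a dictionary keyed by (base-node, attribute-node), the z-score uses the population standard deviation over all edge weights, and the h-index cut is applied directly to the ranked s-degrees rather than to a separately built base-homogeneous-network. The function names and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(weights):
    """Standardize edge weights: z = (w - mean) / std (population std is an assumption)."""
    values = list(weights.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # All weights equal: every z-score is zero by convention here.
        return {edge: 0.0 for edge in weights}
    return {edge: (w - mu) / sigma for edge, w in weights.items()}


def s_degrees(z):
    """Sum the z-scores of the edges adjacent to each base-node."""
    totals = {}
    for (base, _attr), score in z.items():
        totals[base] = totals.get(base, 0.0) + score
    return totals


def h_core(scores):
    """Return the top-h nodes, where h is the largest rank r whose r-th score is >= r."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    h = 0
    for rank, (_node, score) in enumerate(ranked, start=1):
        if score < rank:
            break
        h = rank
    return [node for node, _ in ranked[:h]]


def h_structure(weights):
    """Keep only the edges whose base-node belongs to the h-core (assumed reading)."""
    core = set(h_core(s_degrees(z_scores(weights))))
    return {edge: w for edge, w in weights.items() if edge[0] in core}


# Hypothetical toy data: (base-node, attribute-node) -> edge weight.
toy = {("p1", "a1"): 10, ("p1", "a2"): 8, ("p2", "a1"): 1, ("p3", "a2"): 1}
print(h_structure(toy))
```

Because z-scores can be negative and are not integers, comparing an s-degree against its rank is only one plausible reading of the h-index cut; the paper may scale or round the values differently.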
