Embedding of Embedding (EOE): Joint Embedding for Coupled Heterogeneous Networks

Network embedding is increasingly used in network analysis because it effectively learns latent features that encode linkage information. Various network embedding methods have been proposed, but they are designed only for the single-network scenario. In the era of big data, different types of related information can be fused to form a coupled heterogeneous network, which consists of two different but related sub-networks connected by inter-network edges. In this scenario, the inter-network edges can act as complementary information alongside the intra-network ones. This complementary information is important because it can make latent features more comprehensive and accurate. It is even more important when intra-network edges are absent, a situation known as the cold-start problem. In this paper, we therefore propose a method named embedding of embedding (EOE) for coupled heterogeneous networks. In EOE, latent features encode not only intra-network edges but also inter-network ones. To tackle the heterogeneity of the two networks, EOE incorporates a harmonious embedding matrix to further embed the embeddings that encode only intra-network edges. Experiments on a variety of real-world datasets demonstrate that EOE consistently outperforms single-network embedding methods in applications including visualization, link prediction, multi-class classification, and multi-label classification.
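
To make the role of the harmonious embedding matrix concrete, the following is a minimal sketch rather than the method as specified in the paper. It assumes that an intra-network edge is scored by the sigmoid of an inner product of two node embeddings, and that an inter-network edge is scored by a bilinear form through a matrix `M` that maps between the two latent spaces. The names `U`, `V`, `M`, `intra_edge_prob`, `inter_edge_prob`, and `joint_loss` are illustrative, and negative sampling, regularization, and the optimizer are omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def intra_edge_prob(X, i, j):
    """Assumed score for an edge between nodes i and j of the same sub-network."""
    return sigmoid(X[i] @ X[j])

def inter_edge_prob(U, V, M, i, j):
    """Assumed score for an edge between node i of network A and node j of network B.

    M plays the role of the harmonious embedding matrix: it lets embeddings
    from two different latent spaces (possibly of different sizes) be compared.
    """
    return sigmoid(U[i] @ M @ V[j])

def joint_loss(U, V, M, edges_a, edges_b, edges_ab):
    """Negative log-likelihood of the observed edges only (illustrative objective)."""
    loss = 0.0
    for i, j in edges_a:    # intra-network edges of network A
        loss -= np.log(intra_edge_prob(U, i, j))
    for i, j in edges_b:    # intra-network edges of network B
        loss -= np.log(intra_edge_prob(V, i, j))
    for i, j in edges_ab:   # inter-network edges (A node, B node)
        loss -= np.log(inter_edge_prob(U, V, M, i, j))
    return loss

# Toy coupled network: 4 nodes in A (dimension 3), 5 nodes in B (dimension 2).
rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(4, 3))
V = rng.normal(scale=0.1, size=(5, 2))
M = rng.normal(scale=0.1, size=(3, 2))

edges_a = [(0, 1), (1, 2)]
edges_b = [(0, 4), (2, 3)]
edges_ab = [(0, 0), (3, 4)]   # node 3 of A has no intra-network edge

print(joint_loss(U, V, M, edges_a, edges_b, edges_ab))
```

In this toy setup, node 3 of network A has no intra-network edges, so under these assumptions its embedding is informed only by the inter-network term, which is the cold-start situation described above.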
