AspEm: Embedding Learning by Aspects in Heterogeneous Information Networks

Heterogeneous information networks (HINs) are ubiquitous in real-world applications. Due to the heterogeneity in HINs, the typed edges may not fully align with each other. In order to capture the semantic subtlety, we propose the concept of aspects with each aspect being a unit representing one underlying semantic facet. Meanwhile, network embedding has emerged as a powerful method for learning network representation, where the learned embedding can be used as features in various downstream applications. Therefore, we are motivated to propose a novel embedding learning framework-ASPEM-to preserve the semantic information in HINs based on multiple aspects. Instead of preserving information of the network in one semantic space, ASPEM encapsulates information regarding each aspect individually. In order to select aspects for embedding purpose, we further devise a solution for ASPEM based on dataset-wide statistics. To corroborate the efficacy of ASPEM, we conducted experiments on two real-words datasets with two types of applications-classification and link prediction. Experiment results demonstrate that ASPEM can outperform baseline network embedding learning methods by considering multiple aspects, where the aspects can be selected from the given HIN in an unsupervised manner.

[1]  Stephen J. Wright,et al.  Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent , 2011, NIPS.

[2]  Shaowen Wang,et al.  Regions, Periods, Activities: Uncovering Urban Dynamics via Cross-Modal Representation Learning , 2017, WWW.

[3]  Jiawei Han,et al.  Large-Scale Embedding Learning in Heterogeneous Event Data , 2016, 2016 IEEE 16th International Conference on Data Mining (ICDM).

[4]  Xiangnan Kong,et al.  Heterogeneous network embedding enabling accurate disease association predictions , 2019, BMC Medical Genomics.

[5]  Nitesh V. Chawla,et al.  metapath2vec: Scalable Representation Learning for Heterogeneous Networks , 2017, KDD.

[6]  Mingzhe Wang,et al.  LINE: Large-scale Information Network Embedding , 2015, WWW.

[7]  Yizhou Sun,et al.  Ranking-based clustering of heterogeneous information networks with star network schema , 2009, KDD.

[8]  Sanjeev Arora,et al.  Linear Algebraic Structure of Word Senses, with Applications to Polysemy , 2016, TACL.

[9]  Chris Dyer,et al.  Ontologically Grounded Multi-sense Representation Learning for Semantic Vector Space Models , 2015, NAACL.

[10]  Sami Abu-El-Haija,et al.  Learning Edge Representations via Low-Rank Asymmetric Projections , 2017, CIKM.

[11]  Mikhail Belkin,et al.  Laplacian Eigenmaps and Spectral Techniques for Embedding and Clustering , 2001, NIPS.

[12]  Jiawei Luo,et al.  WMGHMDA: a novel weighted meta-graph-based model for predicting human microbe-disease association on heterogeneous information network , 2019, BMC Bioinformatics.

[13]  Yizhou Sun,et al.  Mining heterogeneous information networks: a structural analysis approach , 2013, SKDD.

[14]  Daniel R. Figueiredo,et al.  struc2vec: Learning Node Representations from Structural Identity , 2017, KDD.

[15]  Ivan Titov,et al.  Bilingual Learning of Multi-sense Embeddings with Discrete Autoencoders , 2016, HLT-NAACL.

[16]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[17]  Souvik Ghosh,et al.  Dynamics of Large Multi-View Social Networks: Synergy, Cannibalization and Cross-View Interplay , 2016, KDD.

[18]  Jure Leskovec,et al.  node2vec: Scalable Feature Learning for Networks , 2016, KDD.

[19]  C. Ballantine On the Hadamard product , 1968 .

[20]  Chengqi Zhang,et al.  Tri-Party Deep Network Representation , 2016, IJCAI.

[21]  Jian Pei,et al.  Asymmetric Transitivity Preserving Graph Embedding , 2016, KDD.

[22]  Po-Wei Chan,et al.  PReP: Path-Based Relevance from a Probabilistic Perspective in Heterogeneous Information Networks , 2017, KDD.

[23]  Yizhou Sun,et al.  Task-Guided and Path-Augmented Heterogeneous Network Embedding for Author Identification , 2016, WSDM.

[24]  Jiawei Han,et al.  Mining Query-Based Subnetwork Outliers in Heterogeneous Information Networks , 2014, 2014 IEEE International Conference on Data Mining.

[25]  Charu C. Aggarwal,et al.  Heterogeneous Network Embedding via Deep Architectures , 2015, KDD.

[26]  Steven Skiena,et al.  DeepWalk: online learning of social representations , 2014, KDD.

[27]  Wenwu Zhu,et al.  Structural Deep Network Embedding , 2016, KDD.

[28]  Qiaozhu Mei,et al.  PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks , 2015, KDD.

[29]  Liyuan Liu,et al.  TrioVecEvent: Embedding-Based Online Local Event Detection in Geo-Tagged Tweet Streams , 2017, KDD.

[30]  Gene H. Golub,et al.  Singular value decomposition and least squares solutions , 1970, Milestones in Matrix Computation.

[31]  Andrew McCallum,et al.  Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space , 2014, EMNLP.

[32]  Yizhou Sun,et al.  Personalized entity recommendation: a heterogeneous information network approach , 2014, WSDM.

[33]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[34]  Philip S. Yu,et al.  A Survey of Heterogeneous Information Network Analysis , 2015, IEEE Transactions on Knowledge and Data Engineering.

[35]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[36]  Kevin Chen-Chuan Chang,et al.  Semantic Proximity Search on Heterogeneous Graph by Proximity Embedding , 2017, AAAI.