Embedding Learning with Events in Heterogeneous Information Networks

In real-world applications, objects of multiple types are interconnected, forming <italic>Heterogeneous Information Networks</italic>. In such networks, we make the key observation that many interactions happen because of some <italic>event</italic>, and that the objects in each event form a complete semantic unit. Taking advantage of this property, we propose a generic framework called <italic><bold>H</bold>yper<bold>E</bold>dge-</italic><bold>B</bold><italic>ased</italic> <bold>E</bold><italic>mbedding</italic> (<sc>Hebe</sc>) to learn object embeddings with events in heterogeneous information networks, where a <italic>hyperedge</italic> encompasses the objects participating in one event. The <sc>Hebe</sc> framework models the proximity among objects in each event with two methods: (1) predicting a target object given the other participating objects in the event, and (2) predicting whether the event can be observed given all the participating objects. Since each hyperedge encapsulates more information about a given event, <sc>Hebe</sc> is robust to data sparseness and noise. In addition, <sc>Hebe</sc> remains scalable as the data size grows. Extensive experiments on large-scale real-world datasets show the efficacy and robustness of the proposed framework.
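
The two prediction tasks named above can be illustrated with a small sketch. This is not the authors' implementation: the scoring functions (an averaged context vector with a softmax for task 1, a sum of pairwise dot products with a sigmoid for task 2), the helper names, and the toy event are all assumptions chosen only to make the idea concrete.

```python
import math
import random


def init_embeddings(objects, dim, seed=0):
    """Assign each object a small random vector (hypothetical initialisation)."""
    rng = random.Random(seed)
    return {o: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for o in objects}


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def context_vector(emb, context):
    """Average the embeddings of the other objects in the hyperedge."""
    dim = len(emb[context[0]])
    return [sum(emb[o][k] for o in context) / len(context) for k in range(dim)]


def target_probabilities(emb, context, candidates):
    """Task (1): softmax over candidate targets given the other participants.

    Assumed scoring function: dot product with the averaged context vector.
    """
    ctx = context_vector(emb, context)
    scores = [dot(emb[c], ctx) for c in candidates]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {c: e / total for c, e in zip(candidates, exps)}


def event_probability(emb, event):
    """Task (2): probability that the whole event is observed.

    Assumed scoring function: sigmoid of the sum of pairwise dot products.
    """
    objs = list(event)
    score = sum(
        dot(emb[objs[i]], emb[objs[j]])
        for i in range(len(objs))
        for j in range(i + 1, len(objs))
    )
    return 1.0 / (1.0 + math.exp(-score))


# Toy hyperedge: one publication event with an author, a venue and a term.
objects = ["author:a1", "author:a2", "venue:v1", "term:t1"]
emb = init_embeddings(objects, dim=8)
probs = target_probabilities(
    emb, context=["venue:v1", "term:t1"], candidates=["author:a1", "author:a2"]
)
p_event = event_probability(emb, ["author:a1", "venue:v1", "term:t1"])
```

In a trained model these probabilities would be pushed up for observed events and down for sampled negatives; here the embeddings are random, so the values only show the shape of the computation.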
