Task-Guided Pair Embedding in Heterogeneous Network

Many real-world tasks solved by heterogeneous network embedding methods can be cast as modeling the likelihood of a pairwise relationship between two nodes. For example, the goal of the author identification task is to model the likelihood that a paper was written by a given author (a paper-author pairwise relationship). Existing task-guided embedding methods are node-centric: they compute the likelihood of a pairwise relationship simply by measuring the similarity between the two node embeddings. We argue, however, that task-guided embedding should model the pairwise relationship directly. In this paper, we propose TaPEm, a novel task-guided pair embedding framework for heterogeneous networks that directly models the relationship between a pair of nodes related to a specific task (e.g., the paper-author relationship in author identification). To this end, we 1) propose to learn a pair embedding under the guidance of its associated context path, i.e., a sequence of nodes between the pair, and 2) devise a pair validity classifier that determines whether the pair is valid with respect to the task at hand. By introducing pair embeddings that capture the semantics behind pairwise relationships, we are able to learn the fine-grained relationship between two nodes, which is paramount for task-guided embedding methods. Extensive experiments on the author identification task demonstrate that TaPEm outperforms state-of-the-art methods, especially for authors with few publication records.
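
The contrast between node-centric scoring and pair embedding can be illustrated with a minimal sketch. It is not the TaPEm architecture: the abstract does not specify the encoder, the classifier, or the loss, so every function name, the single tanh layer, the mean-pooled context path, and the squared-distance guidance term below are assumptions made purely for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def node_centric_score(paper_vec, author_vec):
    """Node-centric baseline: likelihood from node-embedding similarity."""
    return sigmoid(dot(paper_vec, author_vec))


def pair_embedding(paper_vec, author_vec, W):
    """Hypothetical pair encoder: one tanh layer over the concatenated
    node vectors. W has one row per output dimension."""
    x = list(paper_vec) + list(author_vec)
    return [math.tanh(dot(row, x)) for row in W]


def context_path_embedding(path_node_vecs):
    """Assumed context-path summary: mean of the vectors of the nodes
    lying on the path between the pair."""
    n = len(path_node_vecs)
    return [sum(col) / n for col in zip(*path_node_vecs)]


def guidance_loss(pair_vec, context_vec):
    """Assumed guidance term: squared distance that pulls the pair
    embedding toward its context-path embedding."""
    return sum((p - c) ** 2 for p, c in zip(pair_vec, context_vec))


def pair_validity(pair_vec, w, b):
    """Hypothetical validity classifier: logistic regression on the
    pair embedding, returning the probability that the pair is valid."""
    return sigmoid(dot(w, pair_vec) + b)


# Toy usage with made-up 2-dimensional embeddings.
paper, author = [0.1, 0.2], [0.3, -0.1]
W = [[0.5, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
pair = pair_embedding(paper, author, W)
context = context_path_embedding([[0.2, 0.0], [0.0, -0.2]])
print(node_centric_score(paper, author))
print(pair_validity(pair, [1.0, 1.0], 0.0), guidance_loss(pair, context))
```

In the node-centric function the score depends only on how similar the two node vectors are, whereas in the pair-based functions the pair has its own representation that can be shaped by the context path and judged by a separate classifier.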
