Dataset Recommendation via Variational Graph Autoencoder

This paper aims to design a query-based dataset recommendation system that accepts a query, expressed as a set of research papers denoting a user’s research interest, and returns a list of recommended datasets ranked by their potential usefulness for the user’s research need. The motivation for building such a system is to spare users the time-consuming literature review otherwise needed to find usable datasets. We start by constructing a two-layer network: one layer is a citation network of papers, and the other is a layer of datasets, each connected to the first-layer papers in which it was used. A query highlights a set of papers in the citation layer. However, answering the query by naively retrieving the datasets linked to these highlighted papers excludes other semantically relevant datasets, which often lie several hops away from the queried papers. We propose to learn representations of research papers and datasets in the two-layer network using a heterogeneous variational graph autoencoder, and then compute the relevance of the query to the dataset candidates based on the learned representations. Extensive evaluation shows that the datasets ranked by our method are more relevant than those obtained by naive retrieval methods and by adaptations of existing related solutions.
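The contrast between naive retrieval and representation-based ranking can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy papers and datasets, the hand-written two-dimensional vectors standing in for learned representations, the choice to represent a query as the mean of its paper vectors, and the use of cosine similarity as the relevance score. The paper's actual encoder, query representation, and scoring function are not specified in the abstract.

```python
from math import sqrt

# Toy two-layer network (hypothetical papers p1..p4 and datasets d1..d3).
citations = {"p1": ["p2"], "p2": ["p3"], "p3": [], "p4": []}  # paper -> cited papers
used_in = {"d1": ["p1"], "d2": ["p3"], "d3": ["p4"]}          # dataset -> papers using it

def naive_retrieval(query_papers):
    """Return only the datasets directly linked to a queried paper."""
    queried = set(query_papers)
    return sorted(d for d, papers in used_in.items() if queried & set(papers))

# Hand-written stand-ins for learned representations (assumption: in the
# described system these would come from a trained graph autoencoder).
embeddings = {
    "p1": [1.0, 0.0], "p2": [0.8, 0.2], "p3": [0.6, 0.4], "p4": [0.0, 1.0],
    "d1": [0.9, 0.1], "d2": [0.5, 0.5], "d3": [0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def rank_datasets(query_papers):
    """Rank all datasets by similarity to the mean vector of the queried papers."""
    vectors = [embeddings[p] for p in query_papers]
    query_vec = [sum(column) / len(vectors) for column in zip(*vectors)]
    scores = {d: cosine(query_vec, embeddings[d]) for d in used_in}
    return sorted(scores, key=scores.get, reverse=True)

print(naive_retrieval(["p1"]))  # only the directly linked dataset
print(rank_datasets(["p1"]))    # every dataset, ordered by similarity
```

In this toy setup, d2 is two citation hops away from the queried paper p1, so naive retrieval misses it, while the similarity-based ranking still places it ahead of the unrelated d3.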
