Graph DNA: Deep Neighborhood Aware Graph Encoding for Collaborative Filtering

We consider recommender systems with side information in the form of graphs. Existing collaborative filtering algorithms mainly use only immediate neighborhood information and have difficulty exploiting deeper neighborhoods beyond 1-2 hops. The main obstacle to exploiting deeper graph information is the rapid growth in time and space complexity as these neighborhoods are incorporated. In this paper, we propose Graph DNA, a novel Deep Neighborhood Aware graph encoding algorithm, for exploiting deeper neighborhood information. DNA encoding computes approximate deep neighborhood information in linear time using Bloom filters, a space-efficient probabilistic data structure, and results in a per-node encoding whose size is logarithmic in the number of nodes in the graph. It can be used in conjunction with both feature-based and graph-regularization-based collaborative filtering algorithms. Compared with directly using higher-order graph information, Graph DNA is memory and time efficient and provides additional regularization. We conduct experiments on real-world datasets, showing that Graph DNA can be easily used with four popular collaborative filtering algorithms and consistently leads to a performance boost with little computational and memory overhead.
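
To make the encoding idea concrete, the following is a minimal sketch of a Bloom-filter neighborhood encoding. It rests on assumptions the abstract does not spell out: each node starts with a filter containing only itself, and each round ORs in the filters of its direct neighbors, so after `depth` rounds a node's filter approximately covers everything within `depth` hops. The names `bloom_positions`, `graph_dna`, and `maybe_within`, the SHA-256 hashing scheme, and the default sizes are hypothetical choices for illustration, not the authors' implementation.

```python
import hashlib


def bloom_positions(item, num_bits, num_hashes):
    """Map an item to num_hashes bit positions (hypothetical hashing scheme)."""
    positions = []
    for seed in range(num_hashes):
        digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
        positions.append(int.from_bytes(digest[:8], "big") % num_bits)
    return positions


def graph_dna(adjacency, depth, num_bits=64, num_hashes=2):
    """Return one Bloom filter (stored as an int bitmask) per node.

    adjacency: dict mapping each node to an iterable of its neighbors;
    every neighbor must also appear as a key.
    """
    # Depth 0: each node's filter contains only the node itself.
    filters = {}
    for node in adjacency:
        bits = 0
        for pos in bloom_positions(node, num_bits, num_hashes):
            bits |= 1 << pos
        filters[node] = bits

    # Each round touches every edge once: OR in the neighbors' filters.
    for _ in range(depth):
        updated = {}
        for node, neighbors in adjacency.items():
            bits = filters[node]
            for neighbor in neighbors:
                bits |= filters[neighbor]
            updated[node] = bits
        filters = updated
    return filters


def maybe_within(filters, source, target, num_bits=64, num_hashes=2):
    """True if target may lie within the encoded neighborhood of source.

    False positives are possible; false negatives are not.
    """
    return all(
        (filters[source] >> pos) & 1
        for pos in bloom_positions(target, num_bits, num_hashes)
    )


# Undirected path graph 0 - 1 - 2 - 3, encoded to depth 2.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
filters = graph_dna(adjacency, depth=2)
print(maybe_within(filters, 0, 2))  # node 2 is two hops from node 0
```

In this sketch each round costs time proportional to the number of edges, and the per-node storage is fixed at `num_bits` regardless of how many nodes the neighborhood actually contains. A membership query for a node outside the neighborhood can still return true, which is the approximation a Bloom filter trades for its compactness.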
