Accelerated diffusion-based recommendation algorithm on tripartite graphs with GPU clusters

Exorbitant computation cost hinders the practical application of recommendation algorithms, especially in time-critical scenarios. Although experiments show that a recommendation algorithm based on integrated diffusion on user-item-tag tripartite graphs can significantly improve the accuracy, diversification, and novelty of recommendations, it is also very time-consuming. A parallel solution is therefore needed to improve the algorithm's performance. This paper presents a parallel implementation of the diffusion-based recommendation algorithm on weighted tripartite graphs using Compute Unified Device Architecture (CUDA), together with related optimizations including shared memory, stream scheduling, and GPU cluster optimization. Compared to the algorithm running on a single CPU core, the unoptimized GPU kernel achieves an average speedup of 153.9x on a GTX 980 with an input dataset of 30000 records. With shared memory applied, the time spent on memory access drops by about 50% on a dataset of 90000 records, and with 2-way stream scheduling the kernel's performance improves by about 7% to 13%. Based on the optimized kernel, we evaluate the performance of the algorithm on GPU clusters with a customized socket communication mechanism. Compared to a single GPU node, we achieve a 7.55x speedup on a cluster of 9 GPUs when recommending for 8000 users. In addition, the speedup of the GPU cluster is 26.1 times that of our 9-node CPU cluster, and it reaches 1586.28 times over the serial algorithm on one CPU core. These results show that GPU technology can dramatically improve the algorithm's performance.
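To make the computation being accelerated concrete, the following is a minimal serial sketch of integrated diffusion on a user-item-tag graph. It assumes the commonly described unweighted formulation: a target user's collected items each start with one unit of resource, the resource diffuses item -> user -> item and, separately, item -> tag -> item, and the two results are mixed with a parameter `lam`. The function names, the dense 0/1 matrices, and the default `lam=0.5` are illustrative assumptions. This is not the paper's weighted variant and not its CUDA kernel.

```python
def diffuse(source, adjacency):
    """Two-step mass diffusion: items -> neighbours -> items.

    adjacency[i][n] is 1 if item i is linked to neighbour n
    (a user or a tag), else 0. Assumed unweighted for this sketch.
    """
    n_items = len(adjacency)
    n_neigh = len(adjacency[0])
    item_deg = [sum(row) for row in adjacency]
    neigh_deg = [sum(adjacency[i][n] for i in range(n_items))
                 for n in range(n_neigh)]

    # Step 1: each item splits its resource equally among its neighbours.
    on_neigh = [0.0] * n_neigh
    for i in range(n_items):
        if item_deg[i] == 0:
            continue
        share = source[i] / item_deg[i]
        for n in range(n_neigh):
            if adjacency[i][n]:
                on_neigh[n] += share

    # Step 2: each neighbour splits what it received equally among its items.
    result = [0.0] * n_items
    for n in range(n_neigh):
        if neigh_deg[n] == 0:
            continue
        share = on_neigh[n] / neigh_deg[n]
        for i in range(n_items):
            if adjacency[i][n]:
                result[i] += share
    return result


def recommend(user, item_user, item_tag, lam=0.5):
    """Rank the items the user has not collected by the mixed diffusion score."""
    # Initial resource: one unit on every item the target user collected.
    source = [float(row[user]) for row in item_user]
    via_users = diffuse(source, item_user)
    via_tags = diffuse(source, item_tag)
    scores = [lam * a + (1.0 - lam) * b
              for a, b in zip(via_users, via_tags)]
    candidates = [i for i in range(len(scores)) if not item_user[i][user]]
    ranking = sorted(candidates, key=lambda i: scores[i], reverse=True)
    return ranking, scores


# Toy data (hypothetical): 3 items, 2 users, 2 tags.
item_user = [[1, 0], [1, 1], [0, 1]]   # rows: items, columns: users
item_tag = [[1, 0], [0, 1], [0, 1]]    # rows: items, columns: tags
ranking, scores = recommend(0, item_user, item_tag)
```

In this sketch each target user's scores are computed independently of every other user's, so the per-user work is a natural candidate for parallel execution. How the paper actually maps the work onto CUDA threads, shared memory, and streams is not stated in this abstract.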
