A Link Prediction Approach to Recommendations in Large-Scale User-Generated Content Systems

Recommending interesting and relevant content from the vast repositories of User-Generated Content systems (UGCs) such as YouTube, Flickr and Digg is a significant challenge. Part of this challenge stems from the fact that classical collaborative filtering techniques, such as k-Nearest Neighbor, cannot be assumed to perform as well in UGCs as in other applications. Such techniques have severe limitations regarding data sparsity and scalability, which make them ill-suited to UGCs. In this paper, we adapt popular Link Prediction algorithms, previously shown to be effective in massive online social networks, to recommend items in UGCs. We evaluate these algorithms on a large dataset that we collected from Flickr. Our results suggest that Link Prediction algorithms are a more scalable and accurate alternative to classical collaborative filtering in the context of UGCs. Moreover, our experiments show that algorithms that consider only the immediate neighborhood of users in a user-item graph outperform algorithms that use the entire graph structure to recommend items. Finally, we find that, contrary to intuition, exploiting explicit social links among users in the recommendation algorithms improves their performance only marginally.
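
To make the idea of neighborhood-based link prediction on a user-item graph concrete, the following is a minimal sketch, not the algorithm evaluated in the paper. It assumes a common-neighbors-style score: a candidate item is scored by the number of length-3 paths (user, shared item, other user, candidate item) that connect it to the target user. The function name `recommend` and the toy `favorites` data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def recommend(favorites, user, top_n=3):
    """Rank items the user has not marked yet by counting length-3 paths
    user -> shared item -> other user -> candidate item (assumed scoring)."""
    # Adjacency of the bipartite user-item graph, in both directions.
    items_of = {u: set(items) for u, items in favorites.items()}
    users_of = defaultdict(set)
    for u, items in items_of.items():
        for item in items:
            users_of[item].add(u)

    own_items = items_of.get(user, set())
    scores = defaultdict(int)
    for item in own_items:
        for other in users_of[item]:
            if other == user:
                continue
            # Every item of a co-favoriting user is a candidate link.
            for candidate in items_of[other]:
                if candidate not in own_items:
                    scores[candidate] += 1

    # Highest score first; ties broken by item id for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Hypothetical toy data: which photos each user marked as a favorite.
favorites = {
    "alice": ["p1", "p2"],
    "bob": ["p1", "p2", "p3"],
    "carol": ["p2", "p4"],
    "dave": ["p5"],
}

print(recommend(favorites, "alice"))
```

Because the score only touches users and items within a few hops of the target user, its cost depends on the size of that local neighborhood rather than on the whole graph, which is the scalability property the abstract attributes to neighborhood-based methods.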
