"Strength Lies in Differences": Diversifying Friends for Recommendations through Subspace Clustering

Nowadays, the WWW offers consumers an overwhelming variety of choices. Recommendation systems ease selection by suggesting items to users. Recommendations for a user, or a group of users, are determined by considering users similar to the ones in question. Scanning the whole database to locate similar users, however, is expensive. Existing approaches build cluster models by employing full-dimensional clustering to find sets of similar users. Because the datasets we deal with are high-dimensional and incomplete, full-dimensional clustering is not the best option. To this end, we explore the fault-tolerant subspace clustering approach. We extend the concept of fault tolerance to density-based subspace clustering and, to speed up our algorithms, introduce a significance threshold so that only promising dimensions are considered for subspace extension. Moreover, as subspace clustering can return a multitude of users, we propose a weighted ranking approach to refine the set of like-minded users. Our experiments on real movie datasets show that the diversification of similar users offered by the subspace clustering approaches results in better recommendations than traditional collaborative filtering and full-dimensional clustering approaches.
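To make the refinement step concrete, the following is a minimal sketch of how candidate users gathered from several subspace clusters might be ranked. It is not the weighting scheme of the paper, which the abstract does not specify. It assumes that each cluster is a pair of (member users, dimensions, i.e. items, spanning the subspace) and that a candidate's weight is the total number of dimensions over all clusters shared with the query user. The function name `rank_friends`, the parameter `top_k`, and the toy data are hypothetical.

```python
from collections import defaultdict


def rank_friends(query_user, clusters, top_k=3):
    """Rank users that co-occur with query_user in subspace clusters.

    clusters: list of (members, dimensions) pairs, each a set.
    Assumed weighting (illustrative only): a candidate earns
    len(dimensions) for every cluster shared with query_user.
    """
    scores = defaultdict(float)
    for members, dims in clusters:
        if query_user not in members:
            continue  # this cluster says nothing about the query user
        for other in members:
            if other != query_user:
                scores[other] += len(dims)
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy data: three subspace clusters over movies m1..m6.
clusters = [
    ({"ann", "bob", "cat"}, {"m1", "m2", "m3"}),
    ({"ann", "bob", "dan"}, {"m4", "m5"}),
    ({"cat", "eve"}, {"m6"}),
]
top = rank_friends("ann", clusters)
```

Under this assumed weighting, a user who shares several clusters, or clusters spanning larger subspaces, with the query user ranks higher, while users who never co-occur with the query user are not considered at all.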
