A Novel Item Cluster-Based Collaborative Filtering Recommendation System

The recent exponential growth in users adopting mobile internet applications, such as e-commerce and social networks, warrants mining the huge volume of data collected from users’ past actions in order to improve businesses and services. The core step in mining is to cluster the data meaningfully, in a way that suits the application. Social network data are structured, and a graphical representation reveals that structure. Graph clustering is therefore an effective way to uncover the underlying structure in the data. The first step in clustering is to calculate the similarity between a pair of vectors. The high dimensionality of the data, which are often noisy and sparse, makes distance measurement hard. In high dimensions, most conventional distance metrics fail to work, because the data points tend to concentrate near the surface of the high-dimensional space. The traditional concepts of similarity and nearest neighbor no longer hold: the spread of pairwise distances, relative to their mean, shrinks as the dimension increases. In this work, we investigate the efficacy of various similarity measures and clustering algorithms on high-dimensional data. We experimented with real-world high-dimensional matrix data, namely the ratings of movies by users. The clustering of movie items depends on a number of factors, such as genre, actors, directors, and whether the movie is prominent and acclaimed or obscure. We experimented with different similarity measures and clustering algorithms, and evaluated the clustering results by matching them against known annotations of the movies. Finally, we propose a novel recommendation algorithm based on item clustering, and evaluate its performance with different distance metrics and clustering algorithms. The methods elaborated here are applicable to other structured data generated in social network applications or in biological investigations.
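The loss of distance contrast in high dimensions can be illustrated with a minimal sketch. It is not part of the experiments reported here: it assumes synthetic points drawn uniformly from the unit hypercube, uses the Euclidean distance, and the helper name `relative_contrast` is hypothetical. It measures how much farther the farthest point is from a query than the nearest one, relative to the nearest distance.

```python
import numpy as np

def relative_contrast(dim, n_points=500, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query to n_points random points (assumed uniform in [0, 1]^dim)."""
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, dim))   # synthetic data, not movie ratings
    query = rng.random(dim)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()

if __name__ == "__main__":
    # The contrast between nearest and farthest neighbor drops as dim grows.
    for dim in (2, 10, 100, 1000):
        print(dim, round(float(relative_contrast(dim)), 3))
```

Under these assumptions the printed contrast falls steadily with the dimension, which is why a nearest neighbor becomes hardly distinguishable from any other point. Real rating data are sparse and far from uniform, so the effect there may differ in magnitude.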
