Recommender Systems: A Subspace Clustering Approach

Researchers from the same lab often spend a considerable amount of time searching for published articles relevant to their current project. Despite having similar interests, they conduct independent, time-consuming searches. While they may share the results afterwards, they are unable to leverage previous search results during the search process. We propose a research paper recommender system that avoids such time-consuming searches by augmenting existing search engines with recommendations based on previous searches performed by others in the lab. Most existing recommender systems were developed for commercial domains with millions of users. The research paper domain, by contrast, has relatively few users compared to the large number of online research papers. The two major challenges with this type of data are its high dimensionality and its sparseness. The novel contribution of this paper is a scalable subspace clustering algorithm (SCuBA) that tackles these problems. Both synthetic and benchmark datasets are used to evaluate the clustering algorithm and to demonstrate that it performs better than traditional collaborative filtering approaches when recommending research papers.
