PENETRATE: Personalized news recommendation using ensemble hierarchical clustering

Recommending online news articles has become a promising research direction as the Internet provides fast access to real-time information from multiple sources around the world. Individual readers have their own reading preferences; however, groups of users are often interested in similar topics. It is therefore helpful to consider individual and group reading behavior simultaneously when recommending news items to online users. In this paper, we propose PENETRATE, a novel PErsonalized NEws recommendaTion framework using ensemble hieRArchical clusTEring to provide attractive recommendation results. Specifically, given a set of online readers, our approach first separates the readers into groups based on their reading histories, where each user may be assigned to several groups. Once a collection of newly published news items is provided, we construct a news hierarchy for each user group. When recommending news articles to a given user, the hierarchies of the user groups to which the user belongs are merged into a single optimal hierarchy. Finally, a list of news articles is selected from this optimal hierarchy based on the user’s personalized information and returned as the recommendation result. Extensive empirical experiments on a set of news articles collected from various popular news websites demonstrate the efficacy of our proposed approach.
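
For illustration only, the four steps outlined in the abstract (overlapping user groups, one news hierarchy per group, merging of hierarchies, personalized selection) might be sketched as below. This is a toy sketch under stated assumptions, not the method of the paper: the threshold-based grouping rule, the nested-tuple hierarchy representation, the "size of the smallest shared subtree" distance, the single-linkage merge, and the round-robin selection are all hypothetical choices, as are the function names `assign_groups`, `tree_distances`, `merge_hierarchies`, and `recommend`. The sketch also assumes that every group hierarchy covers the same set of articles.

```python
from collections import Counter
from itertools import combinations


def assign_groups(history, threshold=0.3):
    """Soft grouping (assumed rule): a user joins every topic group that
    makes up at least `threshold` of their reading history."""
    counts = Counter(history)
    return sorted(t for t, c in counts.items() if c / len(history) >= threshold)


def leaves(tree):
    """Leaf labels of a hierarchy given as nested tuples of article ids."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def tree_distances(tree, dist=None):
    """Distance between two articles = size of the smallest subtree that
    contains both (children are visited first, so the smallest wins)."""
    if dist is None:
        dist = {}
    if isinstance(tree, str):
        return dist
    for child in tree:
        tree_distances(child, dist)
    members = sorted(leaves(tree))
    for pair in combinations(members, 2):
        dist.setdefault(pair, len(members))
    return dist


def merge_hierarchies(trees):
    """Average the per-tree distances, then rebuild one hierarchy with
    single-linkage agglomerative clustering (an assumed merge strategy)."""
    items = sorted(leaves(trees[0]))
    tables = [tree_distances(t) for t in trees]
    avg = {p: sum(tb[p] for tb in tables) / len(tables)
           for p in combinations(items, 2)}

    def linkage(x, y):
        return min(avg[tuple(sorted((a, b)))] for a in x[1] for b in y[1])

    # Each cluster is a pair: (subtree built so far, list of its leaves).
    clusters = [(item, [item]) for item in items]
    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


def recommend(tree, score, k):
    """Take the best remaining article from each top-level branch in turn,
    so that the list covers different parts of the merged hierarchy."""
    if isinstance(tree, str):
        branches = [[tree]]
    else:
        branches = [sorted(leaves(b), key=score, reverse=True) for b in tree]
    picks = []
    while len(picks) < k and any(branches):
        for branch in branches:
            if branch and len(picks) < k:
                picks.append(branch.pop(0))
    return picks
```

In this sketch, a user's groups come from `assign_groups`, the hierarchies of those groups are passed to `merge_hierarchies`, and `recommend` is called on the merged tree with a per-user scoring function (for example, a dictionary lookup of hypothetical relevance scores).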

[1]  E. N. Adams Consensus Techniques and the Comparison of Taxonomic Trees , 1972 .

[2]  Ata Kabán,et al.  On an equivalence between PLSI and LDA , 2003, SIGIR.

[3]  John Riedl,et al.  GroupLens: an open architecture for collaborative filtering of netnews , 1994, CSCW '94.

[4]  M. L. Fisher,et al.  An analysis of approximations for maximizing submodular set functions—I , 1978, Math. Program..

[5]  Thomas Hofmann,et al.  Probabilistic Latent Semantic Analysis , 1999, UAI.

[6]  Alessandro Micarelli,et al.  User Profiles for Personalized Information Access , 2007, The Adaptive Web.

[7]  James H. Martin,et al.  Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition, 2nd Edition , 2000, Prentice Hall series in artificial intelligence.

[8]  Andreas Krause,et al.  Cost-effective outbreak detection in networks , 2007, KDD '07.

[9]  David M. Pennock,et al.  Categories and Subject Descriptors , 2001 .

[10]  Thore Graepel,et al.  WWW 2009 MADRID! Track: Data Mining / Session: Statistical Methods Matchbox: Large Scale Online Bayesian Recommendations , 2022 .

[11]  A. M. Madni,et al.  Recommender systems in e-commerce , 2014, 2014 World Automation Congress (WAC).

[12]  Guy Shani,et al.  An MDP-Based Recommender System , 2002, J. Mach. Learn. Res..

[13]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[14]  Mi Zhang,et al.  Avoiding monotony: improving the diversity of recommendation lists , 2008, RecSys '08.

[15]  John Riedl,et al.  Item-based collaborative filtering recommendation algorithms , 2001, WWW '01.

[16]  Balaji Padmanabhan,et al.  SCENE: a scalable two-stage personalized news recommendation system , 2011, SIGIR.

[17]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[18]  Wei Chu,et al.  Personalized recommendation on dynamic content using predictive bilinear models , 2009, WWW '09.

[19]  Hans-Peter Kriegel,et al.  Instance Selection Techniques for Memory-based Collaborative Filtering , 2002, SDM.

[20]  Michael J. Pazzani,et al.  A personal news agent that talks, learns and explains , 1999, AGENTS '99.

[21]  B. Jaumard,et al.  Cluster Analysis and Mathematical Programming , 2003 .

[22]  Hui Xiong,et al.  Transitive closure and metric inequality of weighted graphs: detecting protein interaction modules using cliques , 2006, Int. J. Data Min. Bioinform..

[23]  Shunzhi Zhu,et al.  Personalized News Recommendation: A Review and an Experimental Investigation , 2011, Journal of Computer Science and Technology.

[24]  Susan T. Dumais,et al.  Newsjunkie: providing personalized newsfeeds via analysis of information novelty , 2004, WWW '04.

[25]  Robin D. Burke,et al.  Hybrid Systems for Personalized Recommendations , 2003, ITWP.

[26]  Wei Chu,et al.  A contextual-bandit approach to personalized news article recommendation , 2010, WWW '10.

[27]  Bernard De Baets,et al.  Algorithms for computing the min-transitive closure and associated partition tree of a symmetric fuzzy relation , 2004, Eur. J. Oper. Res..

[28]  Chris H. Q. Ding,et al.  Hierarchical Ensemble Clustering , 2010, 2010 IEEE International Conference on Data Mining.

[29]  J. C. Dunn,et al.  A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters , 1973 .

[30]  Samir Khuller,et al.  The Budgeted Maximum Coverage Problem , 1999, Inf. Process. Lett..

[31]  Edward N. AdamsIII N-trees as nestings: Complexity, similarity, and consensus , 1986 .

[32]  Jiahui Liu,et al.  Personalized news recommendation based on click behavior , 2010, IUI '10.

[33]  Thomas Hofmann,et al.  Latent semantic models for collaborative filtering , 2004, TOIS.

[34]  Abhinandan Das,et al.  Google news personalization: scalable online collaborative filtering , 2007, WWW '07.

[35]  Peter Brusilovsky,et al.  Open user profiles for adaptive news systems: help or harm? , 2007, WWW '07.

[36]  Deepak Agarwal,et al.  Regression-based latent factor models , 2009, KDD.