Content-based Recommender Systems: State of the Art and Trends

Recommender systems guide users in a personalized way to interesting objects in a large space of possible options. Content-based recommendation systems try to recommend items similar to those a given user has liked in the past. Indeed, the basic process performed by a content-based recommender consists in matching up the attributes of a user profile, in which preferences and interests are stored, with the attributes of a content object (item), in order to recommend new interesting items to the user. This chapter provides an overview of content-based recommender systems, with the aim of imposing a degree of order on the diversity of the different aspects involved in their design and implementation. The first part of the chapter presents the basic concepts and terminology of content-based recommender systems, a high-level architecture, and their main advantages and drawbacks. The second part reviews the state of the art of systems adopted in several application domains, thoroughly describing both classical and advanced techniques for representing items and user profiles. The most widely adopted techniques for learning user profiles are also presented. The last part discusses trends and future research which might lead towards the next generation of systems, describing the role of User Generated Content as a way of taking into account evolving vocabularies, and the challenge of feeding users with serendipitous recommendations, that is to say, surprisingly interesting items that they might not have otherwise discovered.
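
The matching step described above, comparing the attributes of a user profile with the attributes of an item, can be illustrated with a minimal sketch. It is not the chapter's method: it assumes that items are described by plain keyword lists, that the profile is the summed keyword counts of the items the user liked, and that cosine similarity is the matching function. The names `build_profile`, `cosine`, and `recommend` are hypothetical and chosen only for this illustration.

```python
import math
from collections import Counter


def build_profile(liked_items):
    """Assumed profile model: sum the keyword counts of all liked items."""
    profile = Counter()
    for keywords in liked_items:
        profile.update(keywords)
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse keyword-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(profile, candidates, top_n=3):
    """Rank candidate items by similarity to the profile.

    `candidates` maps an item name to its keyword list. Items that share
    no keyword with the profile (score 0) are left out.
    """
    scored = [
        (cosine(profile, Counter(keywords)), name)
        for name, keywords in candidates.items()
    ]
    # Highest score first; ties broken alphabetically by item name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_n] if score > 0]


if __name__ == "__main__":
    liked = [["space", "robot", "future"], ["robot", "ai"]]
    catalog = {
        "A": ["robot", "ai", "future"],
        "B": ["cooking", "pasta"],
        "C": ["space", "opera"],
    }
    print(recommend(build_profile(liked), catalog))
```

Because an item is scored only on the keywords it shares with the profile, this sketch can never suggest an item described in unfamiliar vocabulary, which is one way to see why the abstract singles out evolving vocabularies and serendipitous recommendations as open challenges.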
