Advances in Knowledge Discovery and Data Mining

In people-to-people (P2P) recommendation in social networks, similarity is not symmetric: both parties to a relationship take part in the reciprocal process that determines whether the relationship succeeds. Memory-based collaborative filtering (CF) is widely used and is both effective and efficient in traditional item-to-people recommendation. However, its critical step, computing the similarity between the subjects or objects of recommendation, typically relies on a heuristic, symmetric relationship, which may be flawed in P2P recommendation. In this paper, we show that memory-based CF can be significantly improved by a novel asymmetric model of similarity that considers the probabilities of both positive and negative behaviours, for example, accepting or rejecting a recommended relationship. We also present a unified model of the fundamental principles of collaborative recommender systems that subsumes both user-based and item-based CF. Our experiments evaluate the proposed approach on P2P recommendation in a real-world online dating application and show significantly improved performance over traditional memory-based methods.
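
The abstract does not state the similarity formula, so the following is only an illustrative sketch of what an asymmetric, behaviour-aware similarity could look like, not the model proposed in the paper. It assumes each user's history is a mapping from candidate to an accept/reject flag, and it estimates how often user `v` repeats user `u`'s positive and negative responses on the candidates both have responded to. The names `Responses`, `conditional_agreement`, and `asymmetric_similarity`, and the equal weighting of the two terms, are hypothetical choices made for illustration.

```python
from typing import Dict, Optional

# Hypothetical data layout: user -> {candidate: True if accepted, False if rejected}
Responses = Dict[str, Dict[str, bool]]


def conditional_agreement(
    responses: Responses, u: str, v: str, outcome: bool
) -> Optional[float]:
    """Estimate P(v gave `outcome` | u gave `outcome`) on shared candidates.

    Returns None when u never gave `outcome` on a shared candidate.
    """
    shared = responses[u].keys() & responses[v].keys()
    base = [c for c in shared if responses[u][c] == outcome]
    if not base:
        return None
    return sum(responses[v][c] == outcome for c in base) / len(base)


def asymmetric_similarity(responses: Responses, u: str, v: str) -> float:
    """Average of the positive and negative conditional agreements from u to v.

    The conditioning is on u's behaviour, so swapping u and v generally
    changes the value. Equal weighting is an assumption of this sketch.
    """
    terms = [conditional_agreement(responses, u, v, o) for o in (True, False)]
    terms = [t for t in terms if t is not None]
    return sum(terms) / len(terms) if terms else 0.0


# Toy data (invented for illustration only).
toy: Responses = {
    "alice": {"p1": True, "p2": True, "p3": True, "p4": False},
    "bob": {"p1": True, "p2": True, "p3": False, "p4": False},
}

print(asymmetric_similarity(toy, "alice", "bob"))
print(asymmetric_similarity(toy, "bob", "alice"))
```

On this toy data the two directions give different values, which is the property a symmetric measure such as cosine similarity cannot express.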
