TWITOBI: A Recommendation System for Twitter Using Probabilistic Modeling

Twitter provides search services that help people find new users to follow by recommending popular users or their friends' friends. However, these services do not offer the users most relevant to a given user. Furthermore, Twitter does not yet provide a search service for finding the tweets most interesting to a user. In this paper, we propose TWITOBI, a recommendation system for Twitter that uses probabilistic modeling for collaborative filtering to recommend the top-K users to follow and the top-K tweets to read for a user. Our novel probabilistic model utilizes not only tweet messages but also the relationships between users. We develop an estimation algorithm for learning the model parameters and present a parallelized version of it using MapReduce to handle large data sets. Our performance study with real-life data sets confirms the effectiveness and scalability of our algorithms.
