User behavior learning and transfer in composite social networks

Accurate prediction of user behaviors is important for many social media applications, including social marketing, personalization, and recommendation. A major challenge is that, although many previous works model user behavior from historical behavior logs alone, the available user behavior data or interactions between users and items in a given social network are usually very limited and sparse (e.g., ⩾ 99.9% empty), which makes models overfit the rare observations and fail to provide accurate predictions. We observe that many people are members of several social networks at the same time, such as Facebook, Twitter, and Tencent’s QQ. Importantly, users’ behaviors and interests in different networks influence one another. This provides an opportunity to leverage the knowledge of user behaviors in different networks, treating the overlapping users as bridges, in order to alleviate the data sparsity problem and enhance the predictive performance of user behavior modeling. However, combining different networks “simply and naively” does not work well. In this article, we formulate the problem of modeling multiple networks as “adaptive composite transfer” and propose a framework called ComSoc. ComSoc first selects the most suitable networks inside a composite social network via a hierarchical Bayesian model, parameterized for individual users. It then builds topic models for user behavior prediction using both the relationships in the selected networks and related behavior data. By varying the relational regularization, we obtain different implementations, corresponding to different ways of transferring knowledge from composite social relations. To handle big data, we have implemented the algorithm using MapReduce. We demonstrate that the proposed composite network-based user behavior models significantly improve predictive accuracy over a number of existing approaches on several real-world applications, including a very large social networking dataset from Tencent Inc.
