User modeling and usage profiling based on temporal posting behavior in OSNs

Abstract: In this paper, we study the posting behavior of online social network (OSN) users, in particular posting frequency and temporal patterns, and consider what these patterns suggest about how users use the platform. At the aggregate (macro) level, we find two distinct traffic peaks, one during morning working hours and one in the evening. The morning peak is more pronounced for frequent posters, while the evening peak is more pronounced for the remaining users. We postulate that this difference reflects use of the OSN platform for purposes (e.g., work or customer communication) other than purely social interaction (e.g., with friends and family). We also study user posting behavior at the individual (micro) level and model each user's posting sequence as generated by a Hidden Markov Model (HMM). We compare a simple zeroth-order model (which is equivalent to a topic model such as LDA) with a first-order model in terms of their effectiveness in clustering and predicting user types, and show the advantage gained by the first-order HMM. Overall, our study provides new insights into user activity in today’s OSNs and suggests a framework for profiling users based on their posting activities. We believe our approach will complement other methods of user profiling based on static demographic information and friendship network information.
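
The contrast between a zeroth-order and a first-order model can be illustrated with a minimal sketch. This is not the paper's HMM: it has no hidden states and works directly on observed symbols, so it only shows why conditioning on the previous post can help. The symbol set, the toy sequences, the smoothing constant `alpha`, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical discretization of posting times into coarse symbols.
SYMBOLS = ["morning", "evening"]


def fit_zeroth_order(sequences, symbols, alpha=1.0):
    """Estimate symbol probabilities, ignoring order (add-alpha smoothing)."""
    counts = Counter(s for seq in sequences for s in seq)
    total = sum(counts.values()) + alpha * len(symbols)
    return {s: (counts[s] + alpha) / total for s in symbols}


def fit_first_order(sequences, symbols, alpha=1.0):
    """Estimate transition probabilities P(next | previous) with smoothing."""
    pair_counts = Counter()
    for seq in sequences:
        pair_counts.update(zip(seq, seq[1:]))
    trans = {}
    for prev in symbols:
        row_total = sum(pair_counts[(prev, nxt)] for nxt in symbols)
        row_total += alpha * len(symbols)
        trans[prev] = {
            nxt: (pair_counts[(prev, nxt)] + alpha) / row_total
            for nxt in symbols
        }
    return trans


def loglik_zeroth(seq, probs):
    """Log-likelihood of seq[1:] under the order-free model."""
    return sum(math.log(probs[s]) for s in seq[1:])


def loglik_first(seq, trans):
    """Log-likelihood of seq[1:] given each preceding symbol."""
    return sum(math.log(trans[a][b]) for a, b in zip(seq, seq[1:]))


# Invented toy data: a user who strictly alternates morning and evening posts.
train = [["morning", "evening"] * 10]
test = ["morning", "evening"] * 5

p0 = fit_zeroth_order(train, SYMBOLS)
p1 = fit_first_order(train, SYMBOLS)

# Both scores cover the same positions (everything after the first symbol),
# so they are directly comparable.
ll0 = loglik_zeroth(test, p0)
ll1 = loglik_first(test, p1)
print(f"zeroth-order log-likelihood: {ll0:.3f}")
print(f"first-order log-likelihood:  {ll1:.3f}")
```

On this invented alternating sequence the order-free model sees morning and evening posts as equally likely, while the first-order model captures the alternation and assigns the held-out sequence a higher likelihood. Real posting data would be far noisier, and the gain would depend on how much sequential structure the users actually exhibit.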
