Effectively Predicting Whether and When a Topic Will Become Prevalent in a Social Network

Effective forecasting of future prevalent topics plays an important role in social network business development. It involves two challenging tasks: predicting whether a topic will become prevalent, and predicting when. Existing algorithms for topic modeling, item recommendation, and action forecasting cannot handle these tasks directly. The classic forecasting framework based on time series models may be able to predict a hot topic when the frequency with which users address it changes periodically in a systematic way. In social networks, however, the frequency with which users discuss a topic often changes irregularly. In this paper, we propose a generic probabilistic framework for hot topic prediction and explore machine learning methods for predicting hot topic patterns. We introduce two models, PreWHether and PreWHen, to predict whether and when a topic will become prevalent. In the PreWHether model, we simulate the constructed features of previously observed frequency changes for better prediction. In the PreWHen model, we model the distributions of the time intervals from the emergence of a topic to its prevalence. Extensive experiments on real datasets demonstrate that our method outperforms the baselines and generates more effective predictions.
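To make the two quantities in the abstract concrete, the following is a minimal illustrative sketch and not the PreWHether or PreWHen models themselves, whose details are not given here. It assumes a topic is represented by a list of per-period mention counts, that "prevalent" means reaching a fixed count threshold, and that "emergence" means the first period with a nonzero count. All function names, the threshold, and the toy data are hypothetical.

```python
from statistics import median

def frequency_changes(counts):
    """First differences of a topic's per-period mention counts.

    Assumption: these differences stand in for the 'observed frequency
    changes' from which features could be constructed.
    """
    return [later - earlier for earlier, later in zip(counts, counts[1:])]

def first_index(counts, predicate):
    """Index of the first count satisfying predicate, or None if none does."""
    for i, c in enumerate(counts):
        if predicate(c):
            return i
    return None

def emergence_to_prevalence_intervals(histories, threshold):
    """Intervals (in periods) from first mention to first prevalent period.

    Topics that never reach the (hypothetical) threshold are skipped.
    """
    intervals = []
    for counts in histories:
        emerged = first_index(counts, lambda c: c > 0)
        prevalent = first_index(counts, lambda c: c >= threshold)
        if emerged is not None and prevalent is not None:
            intervals.append(prevalent - emerged)
    return intervals

# Toy data: three topics, counts per period (made up for illustration).
histories = [
    [0, 1, 3, 10],   # emerges at period 1, prevalent at period 3
    [1, 2, 12],      # emerges at period 0, prevalent at period 2
    [0, 0, 1, 1],    # never becomes prevalent
]
intervals = emergence_to_prevalence_intervals(histories, threshold=10)
typical_interval = median(intervals)
```

Here the median of the empirical intervals serves as a crude point estimate of "when"; a probabilistic model would instead fit a distribution to these intervals, and the frequency changes would feed a classifier for "whether".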
