Ups and Downs in Buzzes: Life Cycle Modeling for Temporal Pattern Discovery

In social media analysis, one critical task is detecting bursts of topics, or buzz, reflected in extremely frequent mentions of certain keywords within a short time interval. Detecting buzz not only provides useful insight into the mechanism of information propagation, but also plays an essential role in preventing malicious rumors. However, buzz modeling is challenging because a buzz time series usually exhibits sudden spikes and heavy tails, which most existing time-series models fail to capture. To handle buzz time series, we propose a novel time-series modeling approach that captures the rise-and-fade temporal patterns via Product Life Cycle (PLC) models, a classical concept in economics. More specifically, we propose a mixture of PLC models to capture the multiple peaks in a buzz time series, and we further develop a probabilistic graphical model (K-MPLC) to automatically discover inherent life cycle patterns within a collection of buzzes. Our experimental results show that the proposed method significantly outperforms existing state-of-the-art approaches on buzz clustering.
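To illustrate how a mixture of life cycle curves can represent a buzz with several peaks, here is a minimal sketch. The abstract does not give the PLC functional form, so the gamma-shaped curve below, the function names `life_cycle_curve` and `mixture_of_life_cycles`, and all parameter values are assumptions made for illustration only. This is not the K-MPLC model itself.

```python
import numpy as np

def life_cycle_curve(t, onset, shape, scale):
    """Assumed rise-and-fade curve (gamma-shaped), not the paper's PLC form.

    Zero before `onset`, rises to a peak of 1 at onset + (shape - 1) * scale,
    then decays with a long tail. Requires shape > 1.
    """
    s = np.clip(t - onset, 0.0, None)          # time elapsed since onset
    peak_s = (shape - 1.0) * scale             # location of the maximum
    peak_val = peak_s ** (shape - 1.0) * np.exp(-peak_s / scale)
    return s ** (shape - 1.0) * np.exp(-s / scale) / peak_val

def mixture_of_life_cycles(t, components):
    """Sum of weighted life cycle curves, one per peak in the buzz."""
    total = np.zeros_like(t, dtype=float)
    for weight, onset, shape, scale in components:
        total += weight * life_cycle_curve(t, onset, shape, scale)
    return total

# Hypothetical buzz: a large first spike and a smaller, later echo.
t = np.arange(0, 60, dtype=float)
components = [
    (100.0, 5.0, 3.0, 2.0),   # (weight, onset, shape, scale): peak at t = 9
    (40.0, 30.0, 2.0, 3.0),   # second peak at t = 33
]
series = mixture_of_life_cycles(t, components)
```

Each component contributes one rise-and-fade episode, so the number of components bounds the number of peaks the mixture can express. Fitting the weights and curve parameters to observed mention counts is left out of this sketch.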
