Forecasting the pulse: How deviations from regular patterns in online data can identify offline phenomena

Purpose – The steady increase of data on human behavior collected online holds significant research potential for social scientists. This paper adds a systematic discussion of different online services, their data-generating processes, and the offline phenomena connected to these data. It also demonstrates, in a proof of concept, a new approach to detecting extraordinary offline phenomena through the analysis of online data.

Design/methodology/approach – To detect traces of extraordinary offline phenomena in online data, the paper determines the normal state of the respective communication environment by measuring the regular dynamics of specific variables in data documenting user behavior online. In its proof of concept, the paper does so by concentrating on the diversity of hashtags used on Twitter during a given time span. The paper then uses the seasonal-trend decomposition procedure based on loess (STL) to determine large deviations between the state of the system as forecasted by ...
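The general idea of the approach (measure hashtag diversity per time window, estimate its regular pattern, flag large deviations) can be illustrated with a minimal Python sketch. This is not the paper's implementation. The abstract does not say which diversity measure is used, so Shannon entropy is an assumption here. The sketch also does not perform STL: it substitutes a much cruder per-phase median as the seasonal baseline, only to show the shape of the procedure. All function names, the `period` parameter, and the `threshold` value are hypothetical.

```python
import math
from collections import Counter
from statistics import median


def hashtag_entropy(hashtags):
    """Shannon entropy (in bits) of the hashtags seen in one time window.

    Assumption: entropy stands in for "diversity"; the abstract does not
    name the measure actually used.
    """
    counts = Counter(hashtags)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def seasonal_baseline(series, period):
    """Median value at each position within the seasonal cycle.

    A crude stand-in for the seasonal component that STL would estimate.
    """
    return [median(series[phase::period]) for phase in range(period)]


def flag_deviations(series, period, threshold=5.0):
    """Return indices whose residual from the baseline is unusually large.

    Residuals are scaled by their median absolute deviation (MAD); the
    threshold of 5.0 is an arbitrary illustrative choice.
    """
    baseline = seasonal_baseline(series, period)
    residuals = [x - baseline[i % period] for i, x in enumerate(series)]
    center = median(residuals)
    mad = median([abs(r - center) for r in residuals])
    scale = mad if mad > 0 else 1e-9  # avoid division by zero on flat data
    return [
        i for i, r in enumerate(residuals)
        if abs(r - center) / scale > threshold
    ]


if __name__ == "__main__":
    # Toy diversity series with a cycle of length 4 and one unusual window.
    diversity = [1.0, 2.0, 3.0, 2.0] * 6
    diversity[9] = 0.2  # hypothetical window dominated by a single hashtag
    print(flag_deviations(diversity, period=4))
```

In practice, each value in the series would come from calling `hashtag_entropy` on the hashtags collected in one window (for example, one hour), and a proper STL implementation would replace `seasonal_baseline` so that trend and seasonality are estimated jointly.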
