Evaluating the impact of social-media on sales forecasting: a quantitative study of world's biggest brands using Twitter, Facebook and Google Trends

In the world of digital communication, data from online sources such as social networks may provide additional information about changing consumer interest and significantly improve the accuracy of forecasting models. In this thesis I investigate whether information from Twitter, Facebook and Google Trends can improve daily sales forecasts for companies relative to forecasts based on transactional sales data only. My original contribution to this domain, presented in this thesis, consists of the following main steps:
1. Data collection. I collected Twitter, Facebook and Google Trends data for 75 brands for the period May 2013 to May 2015. Historical transactional sales data was supplied by Certona Corporation.
2. Sentiment analysis. I introduced a new sentiment classification approach that combines the two standard techniques (lexicon-based and machine-learning-based). The proposed method outperforms the state-of-the-art approach by 7% in F-score.
3. Identification and classification of events. I proposed a framework for event detection and a robust method for clustering Twitter events into different types based on the shape of the Twitter volume and sentiment peaks. This approach captures the varying dynamics of information propagation through the social network. I provide empirical evidence that it is possible to identify types of Twitter events that have significant power to predict spikes in sales.
4. Forecasting next-day sales. I explored linear, non-linear and cointegrating relationships between sales and social-media variables for 18 brands and showed that social-media variables can improve daily sales forecasts for the majority of brands by capturing factors such as consumer sentiment and brand perception. Moreover, I found that social-media data alone, without sales information, can be used to predict the direction of sales with an accuracy of 63%.
Industry experts consider the results obtained in this thesis to be valuable for decision making and for strategic planning.
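The comparison at the heart of step 4 can be illustrated with a minimal sketch. It is not the thesis's model: the data are synthetic, the function names (`make_synthetic`, `compare_forecasts`) and the single lagged "buzz" variable are hypothetical, and plain least squares stands in for the linear, non-linear and cointegration models explored in the thesis. It shows only the evaluation idea: fit a sales-only baseline and a model augmented with a social-media variable, compare out-of-sample error, and measure how often a social-media-only model predicts the direction of the next-day change.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients for y ~ X (X already holds an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def make_synthetic(n=400, seed=0):
    """Synthetic daily series (an assumption for illustration, not thesis data)."""
    rng = np.random.default_rng(seed)
    buzz = rng.normal(size=n)  # hypothetical daily social-media signal
    sales = np.zeros(n)
    for t in range(1, n):
        # Sales depend on yesterday's sales and yesterday's social-media signal.
        sales[t] = 0.5 * sales[t - 1] + 0.8 * buzz[t - 1] + rng.normal(scale=0.5)
    return sales, buzz


def compare_forecasts(sales, buzz, train_frac=0.75):
    """Out-of-sample RMSE with and without the social-media lag, plus direction accuracy."""
    y = sales[1:]            # next-day sales
    lag_sales = sales[:-1]   # yesterday's sales
    lag_buzz = buzz[:-1]     # yesterday's social-media signal
    ones = np.ones_like(y)
    split = int(len(y) * train_frac)

    designs = {
        "sales_only": np.column_stack([ones, lag_sales]),
        "sales_plus_social": np.column_stack([ones, lag_sales, lag_buzz]),
    }
    results = {}
    for name, X in designs.items():
        coef = fit_ols(X[:split], y[:split])
        pred = X[split:] @ coef
        results[name] = float(np.sqrt(np.mean((y[split:] - pred) ** 2)))

    # Direction of the day-over-day change, predicted from social data only.
    change = y - lag_sales
    X_social = np.column_stack([ones, lag_buzz])
    coef = fit_ols(X_social[:split], change[:split])
    pred_change = X_social[split:] @ coef
    hits = np.sign(pred_change) == np.sign(change[split:])
    results["direction_accuracy_social_only"] = float(np.mean(hits))
    return results


if __name__ == "__main__":
    sales, buzz = make_synthetic()
    for key, value in compare_forecasts(sales, buzz).items():
        print(f"{key}: {value:.3f}")
```

The split is chronological rather than random, so the models are always scored on days later than any they were fitted on, which is the setting a next-day forecast faces in practice.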
