CrystalBall: A Visual Analytic System for Future Event Discovery and Analysis from Social Media Data

Social media data bear valuable insights regarding events that occur around the world. Events are inherently temporal and spatial. Existing visual text analysis systems have focused on detecting and analyzing past and ongoing events. Few have leveraged social media information to look for events that may occur in the future. In this paper, we present an interactive visual analytic system, CrystalBall, that automatically identifies and ranks future events from Twitter streams. CrystalBall integrates new methods to discover events with interactive visualizations that permit sensemaking of the identified future events. Our computational methods integrate seven different measures to identify and characterize future events, leveraging information regarding time, location, social networks, and the informativeness of the messages. A visual interface is tightly coupled with the computational methods to present a concise summary of the possible future events. A novel connection graph and glyphs are designed to visualize the characteristics of the future events. To demonstrate the efficacy of CrystalBall in identifying future events and supporting interactive analysis, we present multiple case studies and validation studies on analyzing events derived from Twitter data.
