DISCRN: A Distributed Storytelling Framework for Intelligence Analysis

Storytelling connects entities (people, organizations) through their observed relationships to establish meaningful storylines. It can be extended to spatiotemporal storytelling, which incorporates locations, time, and graph computations to enhance coherence and meaning. When performed sequentially, however, these computations become a bottleneck because the massive number of entities makes space and time complexity untenable. This article presents DISCRN (distributed spatiotemporal ConceptSearch-based storytelling), a distributed framework for performing spatiotemporal storytelling. The framework extracts entities from microblogs and event data and links them using a novel ConceptSearch to derive storylines in a distributed fashion using the key-value pair paradigm. Performing these operations at scale allows deeper and broader analysis of storylines. The novel parallelization techniques speed up the generation and filtering of storylines on massive datasets. Experiments with microblog posts such as Twitter data and with events from the Global Database of Events, Language, and Tone (GDELT) show the efficiency of the techniques in DISCRN.
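
As a rough illustration of the key-value pair paradigm mentioned in the abstract, the following single-process Python sketch emits entity pairs as keys in a map step, groups them in a shuffle step, and filters them in a reduce step before chaining the surviving links into storylines. It is not DISCRN's ConceptSearch, and it ignores locations and time: the record layout, the function names (`map_record`, `shuffle`, `reduce_links`, `build_links`, `storylines`), and the `min_support` and `max_hops` parameters are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def map_record(record):
    """Map step: emit ((entity_a, entity_b), record_id) for each co-occurring pair."""
    entities = sorted(set(record["entities"]))
    for a, b in combinations(entities, 2):
        yield (a, b), record["id"]


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as a framework would do between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_links(grouped, min_support=1):
    """Reduce step: keep entity pairs observed in at least min_support records."""
    return {pair: ids for pair, ids in grouped.items() if len(ids) >= min_support}


def build_links(records, min_support=1):
    """Run map, shuffle, and reduce in sequence over an iterable of records."""
    emitted = (kv for record in records for kv in map_record(record))
    return reduce_links(shuffle(emitted), min_support)


def storylines(links, start, end, max_hops=3):
    """Enumerate simple paths of at most max_hops links from start to end."""
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    results = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == end:
            results.append(path)
            continue
        if len(path) > max_hops:  # path already uses max_hops links
            continue
        for neighbor in sorted(adjacency[node]):
            if neighbor not in path:
                stack.append(path + [neighbor])
    return sorted(results)


if __name__ == "__main__":
    # Hypothetical records; real input would be entities extracted from posts or events.
    sample = [
        {"id": "r1", "entities": ["Alice", "Acme Corp"]},
        {"id": "r2", "entities": ["Acme Corp", "Bob"]},
        {"id": "r3", "entities": ["Alice", "Acme Corp"]},
    ]
    print(storylines(build_links(sample), "Alice", "Bob"))
```

Because each map call depends only on its own record and each reduce call only on one key, both steps could in principle be spread across workers, which is the property a key-value framework relies on.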
