Density-Based Spatiotemporal Clustering Algorithm for Extracting Bursty Areas from Georeferenced Documents

As social media attracts growing attention, a huge number of georeferenced documents, that is, documents that include location information, are posted on social media sites. People transmit and collect information over the Internet through these documents, which often relate not only to personal topics but also to local topics and events. Extracting bursty areas associated with local topics and events from georeferenced documents is therefore an important challenge across application domains. This paper proposes a novel spatiotemporal clustering algorithm for that task, called the (ϵ,τ)-density-based spatiotemporal clustering algorithm. The proposed algorithm can recognize clusters that are separated temporally as well as clusters that are separated spatially. The algorithm is evaluated on geo-tagged tweets posted on Twitter. The experimental results show that it can extract bursty areas, in the form of (ϵ,τ)-density-based spatiotemporal clusters, associated with local topics and events.
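The abstract does not spell out the cluster definition, so the following is only a minimal sketch of one plausible reading: a DBSCAN-style procedure in which two documents are neighbors when they lie within a spatial radius ϵ and within a time window τ of each other. The function names, the `min_docs` density threshold, the planar Euclidean distance, and the sample data are all assumptions made for illustration and are not taken from the paper.

```python
from collections import deque
from math import hypot


def st_neighbors(docs, i, eps, tau):
    """Indices of documents within eps (space) and tau (time) of docs[i], including i."""
    x, y, t = docs[i]
    return [
        j for j, (xj, yj, tj) in enumerate(docs)
        if hypot(x - xj, y - yj) <= eps and abs(t - tj) <= tau
    ]


def st_dbscan(docs, eps, tau, min_docs):
    """Label each (x, y, t) document with a cluster id, or -1 for noise.

    Assumed definition: a document is a core document when at least
    min_docs documents (itself included) fall in its (eps, tau) neighborhood.
    """
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(docs)
    cluster_id = 0
    for i in range(len(docs)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = st_neighbors(docs, i, eps, tau)
        if len(neighbors) < min_docs:
            labels[i] = NOISE  # may later be relabeled as a border document
            continue
        labels[i] = cluster_id
        queue = deque(neighbors)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border document reached from a core document
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = st_neighbors(docs, j, eps, tau)
            if len(j_neighbors) >= min_docs:
                queue.extend(j_neighbors)  # j is a core document: keep expanding
        cluster_id += 1
    return labels


# Hypothetical documents: planar coordinates in arbitrary units, time in minutes.
docs = [
    (0.0, 0.0, 0), (0.1, 0.0, 1), (0.0, 0.1, 2),        # burst at one place
    (0.0, 0.0, 100), (0.1, 0.1, 101), (0.0, 0.1, 102),  # same place, much later
    (5.0, 5.0, 1), (5.1, 5.0, 2), (5.0, 5.1, 0),        # other place, same time as first burst
    (9.0, 9.0, 50),                                     # isolated document
]
labels = st_dbscan(docs, eps=0.5, tau=5, min_docs=3)
print(labels)
```

In this toy data the two bursts at the same location end up in different clusters because they are more than τ apart in time, and the burst at the other location is separated because it is more than ϵ away, which is the temporal and spatial separation the abstract refers to. For real geo-tagged documents, a great-circle distance would likely replace the planar one, and a spatial index would replace the linear neighbor scan.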
