The Hurricane Sandy Twitter Corpus

The growing use of social media has made it a critical component of disaster response and recovery efforts. For both preparedness and response, public health officials and first responders have turned to automated tools to help organize and visualize large streams of social media data. This, in turn, has spurred new research into algorithms for information extraction, event detection and organization, and information visualization. One challenge for these efforts has been the lack of a common disaster-response corpus on which researchers can compare and contrast their work. This paper describes the Hurricane Sandy Twitter Corpus: 6.5 million geotagged Twitter posts from the geographic area and time period of Hurricane Sandy in 2012.
