Assessing the Reliability of Relevant Tweets and Validation Using Manual and Automatic Approaches for Flood Risk Communication

Although Twitter has been touted as a preeminent source of up-to-date information on hazard events, the reliability of tweets remains a concern. Our previous publication extracted relevant tweets containing information about the 2013 Colorado flood event and its impacts. Using those relevant tweets, this research examined their reliability (accuracy and trueness) by analyzing their text and image content and comparing it with other publicly available data sources. Text information was identified manually, and image content was extracted automatically with the Google Cloud Vision application programming interface (API), to balance accurate information verification against efficient processing time. The results showed that both the text and the images contained useful information about damaged or flooded roads and streets. This information can support emergency response coordination and inform the allocation of resources when enough tweets contain geocoordinates or location/venue names. This research identifies reliable crowdsourced risk information to facilitate near real-time emergency response through better use of crowdsourced risk communication platforms.
