Paper Information - Evaluating Relevance and Reliability of Twitter Data for Risk Communication

Evaluating Relevance and Reliability of Twitter Data for Risk Communication

ACKNOWLEDGMENTS
DEDICATION
LIST OF TABLES
LIST OF ILLUSTRATIONS
LIST OF ABBREVIATIONS

CHAPTER I - INTRODUCTION
  1.1 Overview
  1.2 Problem Statement
    1.2.1 Natural Hazards
    1.2.2 Risk Information
    1.2.3 Crowdsourcing in Emergency Response
    1.2.4 Data Relevance and Reliability
  1.3 Research Objectives and Outcomes
    1.3.1 Objectives
    1.3.2 Outcomes

CHAPTER II - BACKGROUND
  2.1 Risk Communication
  2.2 Crowdsourcing
  2.3 Crowdsourced Data Quality
    2.3.1 Crowdsourced Data Relevance and Reliability
  2.4 Techniques for Analyzing Crowdsourced Data
    2.4.1 Data Quality Assessment Techniques
  2.5 Summary

CHAPTER III - METHODOLOGY
  3.1 Study Site
  3.2 Data Sets and Processing
    3.2.1 Tweets of 2013 Colorado Floods
      3.2.1.1 Tweets
      3.2.1.2 Tools & Preprocessing
    3.2.2 Geospatial Data
    3.2.3 Survey Data
    3.2.4 NOAA Warning/alert Messages
    3.2.5 Official Warning and Damage Assessment Reports
  3.3 Analytics and Techniques
    3.3.1 Extraction of Relevant Risk Information: Bag-of-words Model
    3.3.2 Survey Responses to Warning/alert Message Content
    3.3.3 Evaluation of Relevance
    3.3.4 Evaluation of Reliability

CHAPTER IV - RESULTS
  4.1 Evaluation of Relevance
    4.1.1 Temporal Trend of Tweets Volume vs. Precipitation Amount
    4.1.2 Spatial Distribution of Tweets vs. the Degree of Damage
    4.1.3 Spatiotemporal Analysis of Tweets
    4.1.4 Content Analysis
    4.1.5 Cosine Similarity Comparison
    4.1.6 Relevance Score
  4.2 Evaluation of Reliability
    4.2.1 Evaluation of Text Content
    4.2.2 Evaluation of Image

CHAPTER V - DISCUSSION AND CONCLUSION
  5.1 Relevance of Tweets to Risk Communication
  5.2 Reliability of Tweets to Risk Communication
  5.3 Research Outcomes
    5.3.1 Implications for Risk Communication
    5.3.2 Implications for GIScience
  5.4 Limitations and Future Research

APPENDIX A - Code
  A.1 MongoDB Code
  A.2 R Code
APPENDIX B - Top Frequent Words & Hashtags
APPENDIX C - Examples of Identified Road/Streets
REFERENCES

Xiaohui Liu
