Geo-knowledge-guided GPT models improve the extraction of location descriptions from disaster-related social media messages

Abstract: Social media messages posted during natural disasters often contain important location descriptions, such as the locations of victims. Recent research has shown that many of these location descriptions go beyond simple place names, such as city names and street names, and are difficult to extract using typical named entity recognition (NER) tools. Although advanced machine learning models could be trained for this task, they require large labeled training datasets that are time-consuming and labor-intensive to create. In this work, we propose a method that fuses geo-knowledge of location descriptions with a Generative Pre-trained Transformer (GPT) model, such as ChatGPT or GPT-4. The result is a geo-knowledge-guided GPT model that can accurately extract location descriptions from disaster-related social media messages. The method uses only 22 training examples that encode geo-knowledge. We conduct experiments comparing this method with nine alternative approaches on a dataset of tweets from Hurricane Harvey. Our method achieves an improvement of over 40% compared with typically used NER approaches. The experimental results also show that geo-knowledge is indispensable for guiding the behavior of GPT models. The extracted location descriptions can help disaster responders reach victims more quickly and may even save lives.
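The abstract does not spell out how the geo-knowledge examples are passed to the GPT model. One plausible reading is few-shot prompting, where a handful of labeled examples precede the message to be processed. The Python sketch below illustrates that idea only. The `GeoExample` type, the category labels, the example messages, the prompt wording, and the `' | '` output format are all assumptions made for illustration; none of them are taken from the paper or its dataset, and no model API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoExample:
    """One labeled example (hypothetical structure, not from the paper)."""
    category: str     # illustrative label for the kind of location description
    message: str      # invented message text, not taken from any dataset
    locations: tuple  # location descriptions an annotator would mark


# Invented examples for illustration only; the paper uses 22 of its own.
EXAMPLES = [
    GeoExample("street address",
               "Family trapped at 123 Example St, water rising",
               ("123 Example St",)),
    GeoExample("road intersection",
               "Cars stuck at Main St and 2nd Ave",
               ("Main St and 2nd Ave",)),
    GeoExample("no location",
               "Praying for everyone tonight",
               ()),
]


def build_prompt(examples, message):
    """Assemble a few-shot prompt: instruction, labeled examples, then the query."""
    lines = [
        "Extract every location description from the message.",
        "Return them separated by ' | ', or 'None' if there are none.",
        "",
    ]
    for ex in examples:
        lines.append(f"Category: {ex.category}")
        lines.append(f"Message: {ex.message}")
        lines.append("Locations: " + (" | ".join(ex.locations) or "None"))
        lines.append("")
    lines.append(f"Message: {message}")
    lines.append("Locations:")
    return "\n".join(lines)


def parse_locations(completion):
    """Turn a model completion in the assumed ' | ' format into a list."""
    text = completion.strip()
    if not text or text.lower() == "none":
        return []
    return [part.strip() for part in text.split("|") if part.strip()]
```

In use, the string returned by `build_prompt(EXAMPLES, some_message)` would be sent to whichever GPT model is available, and the model's reply would be passed to `parse_locations`. Grouping the examples by category is one way the "geo-knowledge" could be made explicit to the model, though the actual encoding used by the authors may differ.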
