Using Generic Ontologies to Infer the Geographic Focus of Text

Certain documents are naturally associated with a country as their geographic focus. Some past work has sought to develop systems that identify this focus, under the assumption that the target country is explicitly mentioned in the document. When this assumption does not hold, the task becomes one of inferring the focus from the context that the document provides. Although some existing work has considered this variant of the task, it typically relies on specialized geographic resources. In this work we seek to demonstrate that this inference task can be tackled using generic ontologies, such as ConceptNet and YAGO, that were developed independently of the task. We describe GeoMantis, our system for inferring the geographic focus of a document, and we undertake a comparative evaluation against two freely available open-source systems. Our results show that GeoMantis outperforms both systems on news stories whose target country is either not explicitly mentioned in the story text or has been artificially obscured.
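To make the inference task concrete, the following is a minimal sketch of how ontology facts might be used to rank candidate countries for a text that never names its target country. It is not the GeoMantis algorithm, which the abstract does not detail. The triples in `TOY_TRIPLES`, the `LocatedIn` relation name, and the functions `build_index` and `rank_countries` are all hypothetical, chosen only for illustration; a real system would draw its facts from a resource such as ConceptNet or YAGO.

```python
import re
from collections import Counter

# Hypothetical, hand-written stand-in for facts pulled from a generic ontology.
# Each triple is (subject, relation, country); none of these come from the paper.
TOY_TRIPLES = [
    ("eiffel tower", "LocatedIn", "France"),
    ("louvre", "LocatedIn", "France"),
    ("seine", "LocatedIn", "France"),
    ("colosseum", "LocatedIn", "Italy"),
    ("tiber", "LocatedIn", "Italy"),
]


def build_index(triples):
    """Group the subject terms of the triples by the country they point to."""
    index = {}
    for subject, _relation, country in triples:
        index.setdefault(country, set()).add(subject.lower())
    return index


def rank_countries(text, index):
    """Rank countries by how many of their associated terms occur in the text.

    Assumption: a simple count of matched terms is used as the score; this is
    an illustrative choice, not the scoring used by any published system.
    """
    # Normalize to lowercase alphanumeric tokens so matching is on whole words.
    normalized = " ".join(re.findall(r"[a-z0-9]+", text.lower()))
    padded = f" {normalized} "
    scores = Counter()
    for country, terms in index.items():
        for term in terms:
            if f" {term} " in padded:
                scores[country] += 1
    # Only countries with at least one matched term appear in the result.
    return scores.most_common()


story = (
    "Crowds gathered near the Louvre and along the Seine "
    "as the Eiffel Tower lights went out."
)
ranking = rank_countries(story, build_index(TOY_TRIPLES))
```

Here the story never mentions France, yet the three matched terms place it at the top of `ranking`. Counting matches is the simplest possible scoring choice; weighting terms by how specific they are to one country would be a natural refinement.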
