Extraction and exploration of spatio-temporal information in documents

In the past couple of years, there have been significant advances in temporal information retrieval (TIR) and geographic information retrieval (GIR), which focus on extracting and utilizing temporal and geographic information, respectively, from documents for search and exploration tasks. However, little work combines models, techniques, and applications from these two areas to support scenarios in which temporal and geographic information together yield meaningful insights during document exploration, such as visualizing a chronological sequence of events with their locations. In this paper, we present an approach that combines TIR and GIR. Using temporal and geographic information extracted from documents and recorded in temporal and geographic document profiles, we show how co-occurrences of such information are determined and how spatio-temporal document profiles are computed. Such profiles then provide the basis for a variety of document search and exploration tasks, such as visualizing sequences of events on a map. We present a prototypical implementation of our system and demonstrate the effectiveness of combining GIR and TIR in the context of document exploration tasks.
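To make the idea of a spatio-temporal document profile concrete, the following is a minimal Python sketch, not the implementation described in the paper. It assumes that each profile entry records the index of the sentence in which an expression occurs, that a temporal and a geographic expression co-occur when their sentences lie within a fixed window of each other, and that dates are normalized to ISO strings of equal granularity so that string order matches chronological order. The names `TemporalEntry`, `GeoEntry`, `spatio_temporal_profile`, and `window` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TemporalEntry:
    """One entry of a (hypothetical) temporal document profile."""
    sentence: int  # index of the sentence containing the expression
    value: str     # normalized date, assumed ISO format, e.g. "1969-07-20"


@dataclass(frozen=True)
class GeoEntry:
    """One entry of a (hypothetical) geographic document profile."""
    sentence: int  # index of the sentence containing the place name
    name: str      # normalized place name
    lat: float     # latitude in decimal degrees
    lon: float     # longitude in decimal degrees


def spatio_temporal_profile(temporal, geo, window=0):
    """Pair temporal and geographic entries that co-occur.

    Assumption: two entries co-occur if their sentence indices differ
    by at most `window` (window=0 means "same sentence").
    Returns (TemporalEntry, GeoEntry) pairs sorted chronologically,
    with ties broken by sentence position.
    """
    pairs = [
        (t, g)
        for t, g in product(temporal, geo)
        if abs(t.sentence - g.sentence) <= window
    ]
    return sorted(pairs, key=lambda p: (p[0].value, p[0].sentence))
```

The sorted pairs give a chronological sequence of (date, location) points, which is the kind of input a map-based event visualization would need. Widening `window` trades precision for recall; which co-occurrence criterion works best is an open design choice in this sketch.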
