Passage Extraction in Geographical Documents

This paper presents a project that aims to retrieve information from geographical documents. It relies on the generic structure of geographical information, which relates phenomena (for example, of a sociological or economic nature) to locations in space and time. The system includes semantic analysers for spatial and temporal expressions, a term extractor (for phenomena), and a discourse analysis module that links the three components together, relying mostly on Charolles’ discourse universes model. Documents are processed off-line and the results are stored as XML markup, ready for queries that combine the three components of geographical information. Ranked lists of dynamically bounded passages are returned as answers.
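
As a rough illustration of querying such markup, the following minimal sketch ranks passages by how many of the three components (space, time, phenomenon) match a query. The element name `passage`, the attribute names, the sample data, and the function `rank_passages` are all hypothetical assumptions made for this example; the abstract does not specify the actual markup scheme or ranking function. Passage boundaries are fixed here, so the dynamic bounding of passages is not modelled.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element and attribute names are illustrative
# assumptions, not the project's actual annotation scheme.
SAMPLE = """
<document>
  <passage id="p1" space="Normandy" time="1990" phenomenon="unemployment"/>
  <passage id="p2" space="Normandy" time="1985" phenomenon="unemployment"/>
  <passage id="p3" space="Brittany" time="1990" phenomenon="migration"/>
  <passage id="p4" space="Alsace" time="1975" phenomenon="housing"/>
</document>
"""


def rank_passages(xml_text, space=None, time=None, phenomenon=None):
    """Return passage ids ordered by the number of matched query components."""
    root = ET.fromstring(xml_text.strip())
    query = {"space": space, "time": time, "phenomenon": phenomenon}
    # Keep only the components the user actually specified.
    wanted = {key: value for key, value in query.items() if value is not None}

    scored = []
    for passage in root.iter("passage"):
        # Assumed scoring: one point per exactly matching component.
        score = sum(1 for key, value in wanted.items() if passage.get(key) == value)
        if score > 0:
            scored.append((score, passage.get("id")))

    # Highest score first; ties broken by passage id for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [passage_id for _, passage_id in scored]


if __name__ == "__main__":
    print(rank_passages(SAMPLE, space="Normandy", time="1990",
                        phenomenon="unemployment"))
```

Exact attribute matching is the simplest possible choice; a real system would presumably compare the output of the spatial and temporal analysers (for instance, region containment or interval overlap) rather than raw strings.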