Integrating Information Extraction and Automatic Hyperlinking

This paper presents a novel information system integrating advanced information extraction technology and automatic hyper-linking. Extracted entities are mapped into a domain ontology that relates concepts to a selection of hyperlinks. For information extraction, we use SProUT, a generic platform for the development and use of multilingual text processing components. By combining finite-state and unification-based formalisms, the grammar formalism used in SProUT offers both processing efficiency and a high degree of decalrativeness. The ExtraLink demo system show-cases the extraction of relevant concepts from German texts in the tourism domain, offering the direct connection to associated web documents on demand.