A Cross Language Document Retrieval System Based on Semantic Annotation

This paper describes a cross-lingual document retrieval system for the medical domain. The system uses a controlled vocabulary (UMLS) to construct an XML-based intermediary representation into which both queries and documents are mapped. It assists in retrieving English and German medical scientific abstracts relevant to a German query document (an electronic patient record). The modularity of the system allows for deployment in other domains, given appropriate linguistic and semantic resources.
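The idea of mapping queries and documents into a shared, language-independent representation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the system's actual design: the toy lexicon, the placeholder concept identifiers (`CONCEPT_001` and so on, which are not real UMLS identifiers), the XML element and attribute names, the naive substring matching, and the overlap-based score.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy lexicon: surface term -> language-independent concept ID.
# The IDs are placeholders, not real controlled-vocabulary identifiers.
LEXICON = {
    "en": {"heart attack": "CONCEPT_001", "aspirin": "CONCEPT_002"},
    "de": {
        "herzinfarkt": "CONCEPT_001",
        "aspirin": "CONCEPT_002",
        "diabetes": "CONCEPT_003",
    },
}


def annotate(text, lang):
    """Build an XML element listing the concepts found in `text`."""
    root = ET.Element("document", {"lang": lang})
    lowered = text.lower()
    # Naive substring lookup; a real system would need linguistic processing.
    for term, concept_id in sorted(LEXICON[lang].items()):
        if term in lowered:
            ET.SubElement(root, "concept", {"id": concept_id, "term": term})
    return root


def concept_ids(doc):
    """Collect the set of concept IDs annotated on a document element."""
    return {c.get("id") for c in doc.findall("concept")}


def score(query, doc):
    """Fraction of the query's concepts that also occur in the document."""
    q, d = concept_ids(query), concept_ids(doc)
    return len(q & d) / len(q) if q else 0.0


# A German query and two candidate abstracts, one English and one German.
query = annotate("Patient mit Herzinfarkt, erhält Aspirin.", "de")
abstract_en = annotate("Aspirin after heart attack: a review.", "en")
abstract_de = annotate("Diabetes bei älteren Patienten.", "de")

if __name__ == "__main__":
    print(ET.tostring(query, encoding="unicode"))
    print(score(query, abstract_en), score(query, abstract_de))
```

Because matching happens on concept identifiers and not on surface words, the German query in this sketch can match the English abstract even though the two share almost no vocabulary.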