Cross-lingual event-mining using wordnet as a shared knowledge interface

We describe a concept-based event-mining system that maximizes the information extracted from text and is not restricted to predefined knowledge templates. Such a system needs to handle a wide range of expressions while still extracting precise semantic relations. The system applies simple patterns of linguistic and ontological constraints to a uniform representation of the text. It uses a generic ontology based on DOLCE, together with wordnets in different languages, to extract events from text in these languages in an interoperable way. The system performs unsupervised, domain-independent event-mining with promising results. Error analysis showed that the main causes of errors are not the semantic model or the mapping of text to concepts through word-sense disambiguation (WSD), but the complexity of the grammatical structures and the quality of parsing. Using the shared semantic model and the links between the wordnets, our English event-mining patterns were transferred to Dutch in less than a day’s work. The platform was tested on the environment domain but can be applied to any other domain.
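The idea of one pattern serving several languages can be illustrated with a minimal sketch. It is not the system's implementation: the toy ontology, the two miniature wordnets, the concept identifiers, the role labels, and the function names are all assumptions made for illustration. Only the top-level classes Endurant and Perdurant are taken from DOLCE. Lemmas in each language are mapped to shared concept identifiers, and a single pattern of role and ontological-class constraints is checked against those concepts.

```python
from dataclasses import dataclass

# Hypothetical toy ontology: each concept points to its parent class.
# Endurant/Perdurant are DOLCE top-level classes; the rest is invented.
ONTOLOGY = {
    "c:pollute": "Process",
    "c:factory": "PhysicalObject",
    "c:river": "PhysicalObject",
    "Process": "Perdurant",
    "PhysicalObject": "Endurant",
}

# Hypothetical miniature wordnets: lemmas in each language map to the
# same shared concept identifiers (standing in for cross-wordnet links).
WORDNETS = {
    "en": {"pollute": "c:pollute", "factory": "c:factory", "river": "c:river"},
    "nl": {"vervuilen": "c:pollute", "fabriek": "c:factory", "rivier": "c:river"},
}


@dataclass
class Token:
    lemma: str
    role: str  # simplified grammatical role: "pred", "subj" or "obj"


def is_a(concept, ancestor):
    """Return True if `ancestor` is `concept` or one of its superclasses."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = ONTOLOGY.get(concept)
    return False


# One language-independent pattern: grammatical role -> required class.
PATTERN = {"pred": "Perdurant", "subj": "Endurant", "obj": "Endurant"}


def extract_event(tokens, lang):
    """Map lemmas to concepts and keep those satisfying the pattern."""
    wordnet = WORDNETS[lang]
    event = {}
    for tok in tokens:
        concept = wordnet.get(tok.lemma)
        required = PATTERN.get(tok.role)
        if concept and required and is_a(concept, required):
            event[tok.role] = concept
    # Report an event only if every role in the pattern was filled.
    return event if set(event) == set(PATTERN) else None


en_tokens = [Token("factory", "subj"), Token("pollute", "pred"), Token("river", "obj")]
nl_tokens = [Token("fabriek", "subj"), Token("vervuilen", "pred"), Token("rivier", "obj")]

# The same pattern is applied to both languages without modification.
print(extract_event(en_tokens, "en") == extract_event(nl_tokens, "nl"))
```

Because the pattern refers only to roles and ontological classes, never to words, moving it to a new language in this sketch requires only a wordnet for that language linked to the same concepts.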