A Framework for Integrating Deep and Shallow Semantic Structures in Text Mining

Recent work in knowledge representation undertaken as part of the Semantic Web initiative has established a common infrastructure, the Resource Description Framework (RDF) and RDF Schema, for sharing knowledge of ontologies and instances. In this paper we present a framework for combining the shallow levels of semantic description commonly used in MUC-style information extraction with the deeper semantic structures available in such ontologies. The framework is implemented in the PIA project's software, called Ontology Forge. Ontology Forge offers a server-based hosting environment for ontologies, a server-side information extraction system that reduces the effort of writing annotations, and a many-featured ontology/annotation editor. We discuss the knowledge framework, describe some features of the system, and summarize results from extended named entity experiments designed to capture instances in texts using support vector machine software.