A framework for relationship extraction from unstructured text via link grammar parsing

A major task in information extraction is to extract relations between named entities. Relation extraction not only builds and extends knowledge bases and ontologies but also supports downstream applications such as graph mining. In this paper, we report a relation extraction framework based on the natural language theory of link grammar. Our methodology uses and extends Akbik and Broß’s Wanderlust approach, in which linguistic paths defined over the dependency grammar of sentences guide the relation extraction process. In particular, our framework splits a document into sentences, creates a dependency tree for each sentence, tags and categorizes entities, and extracts relations between these entities. The accuracy of our framework depends on the choice of linguistic paths, and we obtain scores as high as 95% precision, 36% recall, and 44% F-score. We also envision natural extensions of our work in which cross-sentence references are resolved and/or the context and content of the sentence constrain the linguistic paths.
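
To make the path-guided extraction step concrete, the following is a minimal sketch and not the framework's actual implementation. It assumes a sentence has already been parsed into labeled links between words (here a hand-written toy parse using the link grammar labels S for subject and O for object), that entities have already been tagged, and that every word occurs only once in the sentence. The names `shortest_path`, `extract_relations`, `prf`, and `allowed_paths` are hypothetical and chosen for illustration only.

```python
from collections import deque


def shortest_path(links, start, goal):
    """Breadth-first search over the links, treated as undirected.

    Returns (labels, words) along the path, or None if no path exists.
    """
    graph = {}
    for left, label, right in links:
        graph.setdefault(left, []).append((right, label))
        graph.setdefault(right, []).append((left, label))
    queue = deque([(start, [], [start])])
    seen = {start}
    while queue:
        word, labels, words = queue.popleft()
        if word == goal:
            return labels, words
        for nxt, label in graph.get(word, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, labels + [label], words + [nxt]))
    return None


def extract_relations(links, entities, allowed_paths):
    """Keep an entity pair only if the label sequence linking it is allowed."""
    relations = []
    for i, first in enumerate(entities):
        for second in entities[i + 1:]:
            found = shortest_path(links, first, second)
            if found is None:
                continue
            labels, words = found
            if tuple(labels) in allowed_paths:
                # Words strictly between the two entities form the predicate.
                relations.append((first, " ".join(words[1:-1]), second))
    return relations


def prf(extracted, gold):
    """Precision, recall, and F-score of extracted triples against a gold set."""
    extracted, gold = set(extracted), set(gold)
    correct = len(extracted & gold)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Toy parse of "Einstein developed relativity" (hand-written, assumed).
links = [("developed", "S", "Einstein"), ("developed", "O", "relativity")]
entities = ["Einstein", "relativity"]
allowed_paths = {("S", "O")}  # assumed set of accepted linguistic paths

relations = extract_relations(links, entities, allowed_paths)
```

In this sketch, a narrow `allowed_paths` set accepts few entity pairs, which would tend to favor precision over recall; enlarging the set would be expected to shift the balance the other way. This is one way to read the dependence of accuracy on the choice of linguistic paths.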