Towards a Workbench for Acquisition of Domain Knowledge from Natural Language

In this paper we describe the architecture and functionality of the main components of a workbench for the acquisition of domain knowledge from large text corpora. The workbench supports an incremental process of corpus analysis that starts with a rough automatic extraction and organization of lexico-semantic regularities and ends with a computer-supported analysis of the extracted data and a semi-automatic refinement of the resulting hypotheses. To do this, the workbench employs methods from computational linguistics, information retrieval and knowledge engineering. Although the workbench is still under implementation, some of its components are already implemented, and their performance is illustrated with samples from engineering for a medical domain.