Automated Extraction of Conceptual Knowledge from a Chinese Machine-Readable Dictionary

In this paper, we exploit a Chinese machine-readable dictionary to extract conceptual knowledge, i.e. <attribute, value> pairs for four attributes, namely hypernym, (artificiality) material, (artificiality) function and (medicine) usage, from the definitions of the corresponding nominal entries. Our method focuses on (1) constructing the extraction patterns and (2) deciding statistically when to apply them. Accordingly, our work follows a new three-step procedure. First, we annotate the definitions of a number of nominal entries, which serve as training samples for the four attributes and their contextual linguistic features. Second, we design patterns for extracting this conceptual knowledge and train a Maximum Entropy (ME) classifier to decide whether a pattern is applicable in the current context. Finally, we apply the patterns to the remaining nominal entries of the dictionary and achieve relatively satisfactory results.
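The procedure can be illustrated with a minimal sketch under stated assumptions; it is not the implementation used in this work. The extraction pattern, the three context features, the four annotated definitions and all function names below are invented for illustration. The binary ME classifier is written as a logistic regression trained by gradient ascent, which is equivalent to a two-class maximum entropy model.

```python
import math
import re

# Hypothetical extraction pattern (not taken from the paper): "用 X 制成的 Y"
# ("Y made of X") yields a material value X and a hypernym value Y.
MATERIAL_PATTERN = re.compile(r"用(?P<material>.+?)制成的(?P<hypernym>.+)")


def features(definition):
    """Toy binary context features; the real feature set is an assumption here."""
    return [
        1.0,                                          # bias term
        1.0 if "制成" in definition else 0.0,          # cue word "made (of)" present
        1.0 if definition.startswith("用") else 0.0,   # definition starts with "用"
    ]


def prob_applicable(weights, feats):
    """Two-class maximum entropy model, i.e. logistic regression."""
    score = sum(w * f for w, f in zip(weights, feats))
    return 1.0 / (1.0 + math.exp(-score))


def train(samples, epochs=200, lr=0.5):
    """Fit the weights by stochastic gradient ascent on the log-likelihood."""
    weights = [0.0] * len(features(samples[0][0]))
    for _ in range(epochs):
        for definition, label in samples:
            feats = features(definition)
            error = label - prob_applicable(weights, feats)
            weights = [w + lr * error * f for w, f in zip(weights, feats)]
    return weights


def extract(definition, weights, threshold=0.5):
    """Apply the pattern only when the classifier judges it applicable."""
    if prob_applicable(weights, features(definition)) < threshold:
        return {}
    match = MATERIAL_PATTERN.search(definition)
    return match.groupdict() if match else {}


# Invented annotated definitions: 1 = the pattern applies, 0 = it does not.
TRAINING = [
    ("用木头制成的家具", 1),   # "furniture made of wood"
    ("用钢铁制成的工具", 1),   # "tool made of steel"
    ("用来切东西的工具", 0),   # "tool used for cutting things"
    ("一种常见的家具", 0),     # "a common kind of furniture"
]

WEIGHTS = train(TRAINING)
```

With these toy weights, a call such as extract("用竹子制成的乐器", WEIGHTS) should return the material and hypernym values of the unseen definition, while a definition that merely begins with "用" is rejected by the classifier before the pattern is tried. Separating the applicability decision from the pattern itself mirrors the two concerns named above: pattern construction and the statistical decision for applying a pattern.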