Automatic Domain-Ontology Structure and Example Acquisition from Semi-Structured Texts

This paper presents a new method for acquiring Domain-Ontology structure and examples from semi-structured data sources. First, the Domain-Ontology structure is extracted: candidate attributes are extracted using certain patterns, and a statistical method is applied to filter out the incorrect attributes. Second, using the Domain-Ontology structure as a clue, example extraction patterns are generated automatically. Finally, Ontology examples are acquired by taking advantage of the special structural features of Web pages. Experiments are carried out in the film domain: the precision of Ontology structure extraction is 83.7%, and the highest recall of example extraction reaches 90%. The experimental results demonstrate that the method developed in this paper is fairly effective.
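The three steps can be illustrated with a minimal Python sketch. It is not the paper's implementation: the "label: value" pattern, the document-frequency threshold standing in for the statistical filter, the function names, the `min_support` parameter, and the sample pages are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical "label: value" pattern; the patterns used in the paper are not given here.
CANDIDATE = re.compile(r"^\s*([A-Za-z][A-Za-z ]{0,30}?)\s*:\s*(\S.*)$")


def extract_structure(pages, min_support=0.5):
    """Step 1: collect candidate attributes, then keep those found on enough pages.

    The document-frequency threshold is an assumed stand-in for the
    statistical filter mentioned in the abstract.
    """
    doc_freq = Counter()
    for page in pages:
        labels = set()
        for line in page.splitlines():
            match = CANDIDATE.match(line)
            if match:
                labels.add(match.group(1).strip().lower())
        doc_freq.update(labels)  # count each label once per page
    return sorted(a for a, n in doc_freq.items() if n / len(pages) >= min_support)


def build_patterns(attributes):
    """Step 2: use the structure as a clue to generate one pattern per attribute."""
    return {
        a: re.compile(
            rf"^[ \t]*{re.escape(a)}[ \t]*:[ \t]*(\S.*)$",
            re.IGNORECASE | re.MULTILINE,
        )
        for a in attributes
    }


def extract_example(page, patterns):
    """Step 3: fill one example with the attribute values found on a page."""
    example = {}
    for attribute, pattern in patterns.items():
        match = pattern.search(page)
        if match:
            example[attribute] = match.group(1).strip()
    return example


# Made-up film pages, for illustration only.
PAGES = [
    "Title: Alpha\nDirector: A. Person\nRuntime: 90 min\nNote: see also Beta",
    "Title: Beta\nDirector: B. Person\nRuntime: 101 min",
    "Title: Gamma\nDirector: C. Person\nPosted by: admin",
]

attributes = extract_structure(PAGES)
patterns = build_patterns(attributes)
examples = [extract_example(page, patterns) for page in PAGES]
```

In this sketch, labels such as "Note" and "Posted by" each occur on only one of the three pages and are dropped by the filter, while "Title", "Director", and "Runtime" survive and drive the generated extraction patterns.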