Domain Ontology Learning from Websites

This paper proposes a novel method for learning lightweight domain ontologies from the Web. Creating ontologies (semi-)automatically has been a challenging and critical problem in realizing the Semantic Web. Many methods have been reported for learning ontologies from the Web by analyzing page contents. However, they are not applicable to organization websites, where the description of a concept or an individual is distributed across multiple web pages and the ontological information can only be discovered by considering the website structure. We propose a website-structure-based method to extract an organizational ontology from an organization website. Multiple organizational ontologies of the same domain can then be merged into a domain ontology. The method exploits both the intra-page and inter-page hierarchical relations hidden in the website structure. Empirical experiments show the effectiveness of this approach.