Using Ontologies for Extracting Product Features from Web Pages

In this paper, we show how ontologies can bootstrap a knowledge acquisition process that extracts product information from tabular data on Web pages. We also use logical rules to reason about product-specific properties and to derive higher-order knowledge about product features. We explain the knowledge acquisition process, covering both its ontological and procedural aspects, and give a qualitative and quantitative evaluation of our results.

[1]  Georg Gottlob,et al.  The Lixto Project: Exploring New Frontiers of Web Data Extraction , 2006, BNCOD.

[2]  Marcus Herzog,et al.  Visually guided bottom-up table detection and segmentation in web documents , 2006, WWW '06.

[3]  David E. Millard,et al.  Automatic Ontology-Based Knowledge Extraction from Web Documents , 2003, IEEE Intell. Syst..

[4]  Kunal Patel,et al.  Semantic Processing of the Semantic Web , 2003, SEMWEB.

[5]  Toru Ishida,et al.  Ontology extraction from tables on the Web , 2006, International Symposium on Applications and the Internet (SAINT'06).

[6]  Matthew Hurst,et al.  Layout and Language: Challenges for Table Understanding on the Web , 2001 .

[7]  Kaustubh Supekar,et al.  OntoGenie: Extracting Ontology Instances from WWW , 2003 .

[8]  Cui Tao,et al.  Automatically Extracting Ontologically Specified Data from HTML Tables of Unknown Structure , 2002, ER.

[9]  Xinxin Wang,et al.  Tabular Abstraction, Editing, and Formatting , 1996 .

[10]  David W. Embley,et al.  Ontology generation from tables , 2003, Proceedings of the Fourth International Conference on Web Information Systems Engineering, 2003. WISE 2003..

[11]  P. Schönemann On artificial intelligence , 1985, Behavioral and Brain Sciences.

[12]  David W. Embley,et al.  A Generalized Framework for an Ontology-Based Data-Extraction System , 2005, ISTA.

[13]  Wolfgang Gatterbauer,et al.  Table Extraction Using Spatial Reasoning on the CSS2 Visual Box Model , 2006, AAAI.

[14]  John Mylopoulos,et al.  The Semantic Web - ISWC 2003 , 2003, Lecture Notes in Computer Science.

[15]  Stefano Spaccapietra,et al.  Conceptual Modeling — ER 2002 , 2002, Lecture Notes in Computer Science.

[16]  David W. Embley,et al.  Notes on Contemporary Table Recognition , 2006, Document Analysis Systems.