An ontology-based semantic extraction approach for B2C ecommerce

Although a variety of investigations have examined human semantic interaction with Web resources, substantial progress has yet to be achieved. Comparative shopping systems can be regarded as the latest generation of B2C eCommerce systems: they connect to multiple online stores and collect the information requested by the user. In some cases, the information is extracted from the online store sites through keyword search and other means of textual analysis. These processes rely on assumptions about the proximity of certain pieces of information. Such heuristic approaches are error-prone and are not guaranteed to work. In this paper, we propose an ontology-based approach for extracting product information and vendor prices from the vendors' public Web pages. Most vendors on the Web present their product information in HTML documents, which are not a semantic format; our approach is nevertheless based on understanding the semantics of HTML documents and extracting the information automatically.
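As a rough illustration of the general idea (not the system described in this paper), the sketch below maps the labels found in an HTML product table onto the concepts of a small ontology, rather than relying on keyword proximity. The `ONTOLOGY` dictionary, the `CellCollector` class, the function names, and the sample page are all hypothetical and chosen only for this example; it assumes the vendor lays out product data as label/value table rows.

```python
from html.parser import HTMLParser

# Hypothetical toy ontology: concept -> labels a vendor page might use for it.
ONTOLOGY = {
    "name": {"product", "product name", "item", "title"},
    "price": {"price", "our price", "cost"},
}


class CellCollector(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def concept_for(label):
    """Return the ontology concept a page label refers to, or None."""
    key = label.strip().rstrip(":").strip().lower()
    for concept, labels in ONTOLOGY.items():
        if key in labels:
            return concept
    return None


def extract(html):
    """Build a {concept: value} record from label/value table rows."""
    parser = CellCollector()
    parser.feed(html)
    record = {}
    for row in parser.rows:
        if len(row) >= 2:
            concept = concept_for(row[0])
            if concept:
                record[concept] = row[1].strip()
    return record


# Hypothetical vendor page fragment.
PAGE = """
<table>
  <tr><td>Item:</td><td>USB cable</td></tr>
  <tr><td>Our price:</td><td>$4.99</td></tr>
  <tr><td>Shipping</td><td>Free</td></tr>
</table>
"""

print(extract(PAGE))
```

Because the lookup goes through the ontology, two vendors that label the same field "Item" and "Product name" would yield the same `name` concept, while labels outside the ontology (here, "Shipping") are ignored.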
