Web Mining - The Ontology Approach

The World Wide Web today gives users access to an extremely large number of Web sites, many of which contain information of educational and commercial value. Because Web pages are unstructured or semi-structured and Web sites are designed idiosyncratically, developing digital libraries that organize and manage digital content from the Web is a challenging task. Web mining research, on the other hand, has made significant progress over the last 10 years in categorizing and extracting content from the Web. In this paper, we represent an ontology as a set of concepts and their inter-relationships relevant to some knowledge domain. The knowledge provided by an ontology is extremely useful in defining the structure and scope for mining Web content. We therefore review Web mining and describe the ontology approach to Web mining. We also discuss the application of these Web mining techniques to digital library systems.
