A Web Mining process for e-Knowledge services

This paper describes a Web Mining process designed to support specialized e-Knowledge services. It proposes a new reference architecture based on the orchestration of reusable building blocks, each with a well-defined task and the ability to interoperate with the others. The system supports a decision maker in a service-oriented way by clearly separating six tasks: crawling, pre-processing, information extraction, information retrieval, text mining, and presentation of results. It enables the analysis of Web information by extracting, selecting, processing, and modelling large volumes of data in order to discover rules and patterns across a distributed, heterogeneous collection of informative resources. Finally, the Reputation Management process is presented as a case study.
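The idea of orchestrating reusable building blocks can be illustrated with a minimal sketch. It is not the architecture proposed in the paper: the `Document` type, the stage functions, and the `orchestrate` helper are hypothetical names chosen for illustration, the "crawler" reads from an in-memory dictionary instead of the Web, and the extraction and mining steps are deliberately naive stand-ins. The presentation task is omitted.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical record passed between building blocks."""
    url: str
    raw: str
    tokens: List[str] = field(default_factory=list)
    entities: List[str] = field(default_factory=list)


# Every block shares one interface, so blocks can be reordered or swapped.
Stage = Callable[[List[Document]], List[Document]]


def crawl(pages: Dict[str, str]) -> List[Document]:
    # Stand-in for crawling: the "Web" is an in-memory mapping of URL to text.
    return [Document(url=url, raw=text) for url, text in pages.items()]


def preprocess(docs: List[Document]) -> List[Document]:
    # Lowercase the text and strip surrounding punctuation from each word.
    for doc in docs:
        words = (w.strip(".,;:!?").lower() for w in doc.raw.split())
        doc.tokens = [w for w in words if w]
    return docs


def extract(docs: List[Document]) -> List[Document]:
    # Toy information extraction: capitalized words become candidate entities.
    for doc in docs:
        candidates = {w.strip(".,;:!?") for w in doc.raw.split() if w[:1].isupper()}
        doc.entities = sorted(candidates)
    return docs


def make_retriever(query: str) -> Stage:
    # Toy information retrieval: keep documents that contain the query term.
    def retrieve(docs: List[Document]) -> List[Document]:
        return [doc for doc in docs if query.lower() in doc.tokens]
    return retrieve


def mine(docs: List[Document]) -> Dict[str, int]:
    # Toy text mining: term frequencies over the retrieved documents.
    counts: Counter = Counter()
    for doc in docs:
        counts.update(doc.tokens)
    return dict(counts)


def orchestrate(docs: List[Document], stages: List[Stage]) -> List[Document]:
    # The orchestrator only chains blocks; it knows nothing about their internals.
    for stage in stages:
        docs = stage(docs)
    return docs
```

A pipeline is then assembled by listing the blocks in order, for example `orchestrate(crawl(pages), [preprocess, extract, make_retriever("acme")])`, followed by `mine(...)` on the result. Because each block sees only a list of `Document` objects, replacing the toy extractor with a real one would not require changes to the other blocks.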
