WEB MINING TASKS AND TYPES: A SURVEY

In recent years the growth of the World Wide Web has exceeded all expectations. Several billion HTML documents, pictures, and other multimedia files are now available via the Internet, and the number is still rising. As a large, dynamic, structurally complex, and ever-growing information source, the World Wide Web is fertile ground for applying data-mining principles, an area known as Web mining. Etzioni first coined the term Web mining in 1996. He starts from the hypothesis that information on the Web is sufficiently structured, and outlines the subtasks of Web mining. Web mining is a highly active research topic that combines two research areas: data mining and the World Wide Web. Web mining research relates to several research communities, such as databases, information retrieval, and artificial intelligence. Web mining can be divided into three categories, each dealing with a different feature of a web page: web content mining, web structure mining, and web usage mining. Web content mining discovers useful information or knowledge from web page contents; web structure mining discovers and models the link structure of the Web; and web usage mining discovers interesting usage patterns from web data. This survey paper explains the concepts of Web mining in detail, focusing on its tasks and types.

[1] Lise Getoor, et al. Link mining: a new data mining challenge, 2003, SKDD.

[2] Donald Perlis, et al. Information Retrieval on the World Wide Web and Active Logic: A Survey and Problem Definition, 2002.

[3] Hemant Kumar Singh, et al. Web Data Mining research: A survey, 2010, 2010 IEEE International Conference on Computational Intelligence and Computing Research.

[4] Jaideep Srivastava, et al. Web mining: information and pattern discovery on the World Wide Web, 1997, Proceedings Ninth IEEE International Conference on Tools with Artificial Intelligence.

[5] Jon Kleinberg, et al. Authoritative sources in a hyperlinked environment, 1998, SODA '98.

[6] Oren Etzioni, et al. The World-Wide Web: quagmire or gold mine?, 1996, CACM.

[7] Kavita Sharma, et al. Web mining: Today and tomorrow, 2011, 2011 3rd International Conference on Electronics Computer Technology.

[8] Hendrik Blockeel, et al. Web mining research: a survey, 2000, SKDD.

[9] Jiawei Han, et al. Data Mining: Concepts and Techniques, 2000.

[10] Mark Levene, et al. Mining Association Rules in Hypertext Databases, 1998, KDD.

[11] Huang Yuan, et al. Web mining: knowledge discovery on the Web, 1999, IEEE SMC'99 Conference Proceedings. 1999 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No.99CH37028).

[12] Jaideep Srivastava, et al. Web usage mining: discovery and applications of usage patterns from Web data, 2000, SKDD.

[13] Jian Pei, et al. Data Mining: Concepts and Techniques, 3rd edition, 2006.