Web Mining Overview

In the span of a decade, the World Wide Web has been transformed from a tool for information sharing among researchers into an indispensable part of everyday activities. This transformation has been characterized by an explosion of heterogeneous data and information available electronically, as well as by increasingly complex applications driving a variety of systems for content management, e-commerce, e-learning, collaboration, and other Web services. This tremendous growth, in turn, has necessitated the development of more intelligent tools that help both end users and information providers extract relevant information and discover actionable knowledge more effectively. The potential for extracting valuable knowledge from the Web has been evident from its very beginning. Web mining, i.e., the application of data mining techniques to extract knowledge from Web content, structure, and usage, is the collection of technologies that aims to fulfill this potential. In this article, we briefly summarize each of the three primary areas of Web mining (Web usage mining, Web content mining, and Web structure mining) and discuss some of the primary applications in each area.
