Web Usage Mining and the Challenge of Big Data: A Review of Emerging Tools and Techniques

The web is a rich, dynamic, and fast-growing source for data mining, offering great opportunities that often go unexploited. Web data pose a real challenge to traditional data mining techniques because of their sheer volume and unstructured nature. Web logs record the interactions between visitors and a website, and analyzing them provides insights into visitors’ behavior, usage patterns, and trends. Web usage mining, also known as web log mining, is the process of applying data mining techniques to discover useful information hidden in web server logs. Web administrators use these logs primarily to measure traffic and to detect broken links and other errors. Web usage mining goes further, extracting information that benefits application areas such as web personalization, website restructuring, system performance improvement, and business intelligence. The web usage mining process involves three main phases: pre-processing, pattern discovery, and pattern analysis. Various pre-processing techniques have been proposed to extract information from log files and group primitive data items into meaningful, higher-level abstractions suitable for mining, usually in the form of visitor sessions. The major data mining techniques used for pattern discovery in web usage mining are clustering, association analysis, classification, and sequential pattern discovery. This chapter discusses the web usage mining process, its procedures and methods, and its pattern discovery techniques. The chapter also presents a practical example using real web log data.
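As a rough illustration of the pre-processing phase, the following Python sketch parses log lines and groups each visitor's requests into sessions. It is not the chapter's own method. It assumes the logs follow the Common Log Format, that a visitor can be identified by host address alone, and that a 30-minute gap of inactivity ends a session, which is a common heuristic. The sample log lines and the names `parse_line`, `build_sessions`, and `SAMPLE_LOG` are invented for the example.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed input: Common Log Format, e.g.
# host ident authuser [timestamp] "METHOD path protocol" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)
SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity threshold
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def parse_line(line):
    """Return (host, timestamp, path, status), or None if the line is malformed."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("host"), timestamp, match.group("path"), int(match.group("status"))


def build_sessions(lines):
    """Clean the log and split each host's requests into timeout-based sessions."""
    hits_by_host = defaultdict(list)
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        host, timestamp, path, status = record
        # Data cleaning: drop failed requests and embedded static resources.
        if status >= 400 or path.lower().endswith(STATIC_SUFFIXES):
            continue
        hits_by_host[host].append((timestamp, path))

    sessions = []
    for host, hits in hits_by_host.items():
        hits.sort()
        current = [hits[0][1]]
        for (previous_time, _), (timestamp, path) in zip(hits, hits[1:]):
            # A long gap between consecutive requests starts a new session.
            if timestamp - previous_time > SESSION_TIMEOUT:
                sessions.append((host, current))
                current = []
            current.append(path)
        sessions.append((host, current))
    return sessions


# Invented sample data, for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2023:13:55:37 +0000] "GET /style.css HTTP/1.1" 200 512',
    '10.0.0.1 - - [10/Oct/2023:13:57:02 +0000] "GET /products.html HTTP/1.1" 200 4100',
    '10.0.0.2 - - [10/Oct/2023:14:01:00 +0000] "GET /missing.html HTTP/1.1" 404 0',
    '10.0.0.1 - - [10/Oct/2023:15:10:00 +0000] "GET /index.html HTTP/1.1" 200 2326',
]

sessions = build_sessions(SAMPLE_LOG)
for host, pages in sessions:
    print(host, pages)
```

The resulting page sequences per session are the kind of higher-level abstraction that clustering, association analysis, or sequential pattern discovery can then take as input. Real deployments often need stronger visitor identification (for example, combining host with user agent or cookies), since several visitors may share one address.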
