Big data analysis in e-commerce system using Hadoop MapReduce

Web mining, the process of extracting useful knowledge from web resources, is a challenging task for organizations today. Every organization generates vast amounts of data from various sources, and web servers record user activity in log files. For E-commerce companies, the challenge is to understand customer behavior by analyzing these web log files and to use that understanding to improve the business. An E-commerce website can generate tens of petabytes of data in its web log files. This paper discusses the importance of log files in the E-commerce world and how their analysis is used to learn user behavior in an E-commerce system. Analyzing such large web log files requires parallel processing and a reliable data storage system. The Hadoop framework provides reliable storage through the Hadoop Distributed File System (HDFS) and parallel processing of large data sets through the MapReduce programming model. Together, these mechanisms process log data in parallel and compute results efficiently, which reduces both the response time and the load on the end system. This work proposes a predictive prefetching system based on preprocessing of web logs with Hadoop MapReduce, with the aim of providing accurate results in minimal response time for E-commerce business activities.
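
The abstract does not give the preprocessing steps in detail, so the following is only a minimal sketch of how web log cleaning and page hit counting might be expressed in the MapReduce model. It is written in plain Python and run in a single process to illustrate the map and reduce phases; it is not the authors' implementation and does not call any Hadoop API. The log layout (Common Log Format), the list of static file suffixes, and the names `map_line`, `reduce_counts`, and `count_page_hits` are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumed log layout: Common Log Format, e.g.
# 10.0.0.1 - - [10/Oct/2015:13:55:36 +0000] "GET /product/42 HTTP/1.1" 200 2326
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

# Hypothetical cleaning rule: requests for static assets are not page views.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def map_line(line):
    """Map phase: emit (url, 1) for each clean, successful page request."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return  # drop malformed lines
    if not match.group("status").startswith("2"):
        return  # drop failed or redirected requests
    url = match.group("url")
    if url.lower().endswith(STATIC_SUFFIXES):
        return  # drop static assets
    yield url, 1


def reduce_counts(pairs):
    """Reduce phase: sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_page_hits(lines):
    """Run the map phase over every line, then reduce the emitted pairs."""
    pairs = (pair for line in lines for pair in map_line(line))
    return reduce_counts(pairs)
```

In an actual Hadoop deployment, the map and reduce logic would run as separate tasks over log splits stored in HDFS, with the framework handling the grouping of keys between the two phases. The per-page hit counts produced here are one plausible input to a prefetching decision, but how the proposed system uses them is not stated in the abstract.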
