Web Mining Based on Improved MapReduce Model

Current Web mining systems based on a single server hit a computational bottleneck when processing massive data. To solve this problem, a Web mining method based on cloud-computing technology is proposed: the large data sets and mining tasks are decomposed across multiple computers and processed in parallel. The open-source project Hadoop is used to establish a parallel Web mining platform. Moreover, an improved MapReduce model, MapReduce-LP, is put forward. The effectiveness of the system and the efficiency of the new model are verified by a Web log mining job in an electronic commerce system. Experimental results show that using cloud-computing technology to process large data on a cluster can significantly improve the efficiency of Web mining.
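
To illustrate the decomposition described above, the following is a minimal single-process sketch of the generic MapReduce pattern applied to Web log mining (counting successful hits per URL). It is not the MapReduce-LP model and does not use the Hadoop API. The log layout, the sample lines, and the names `map_phase`, `shuffle`, `reduce_phase`, and `run_job` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical log lines in a simplified "ip method url status" layout.
LOG_LINES = [
    "10.0.0.1 GET /index.html 200",
    "10.0.0.2 GET /cart 200",
    "10.0.0.1 GET /cart 200",
    "10.0.0.3 GET /index.html 404",
]


def map_phase(line):
    """Emit (url, 1) for each successfully served request."""
    parts = line.split()
    if len(parts) == 4 and parts[3] == "200":
        yield parts[2], 1


def shuffle(pairs):
    """Group intermediate values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one URL."""
    return key, sum(values)


def run_job(lines, num_splits=2):
    """Split the input, map each split, then shuffle and reduce.

    On a real cluster each split would be mapped on a different node;
    here the splits are processed sequentially to keep the sketch runnable.
    """
    splits = [lines[i::num_splits] for i in range(num_splits)]
    mapped = []
    for split in splits:
        for line in split:
            mapped.extend(map_phase(line))
    groups = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(run_job(LOG_LINES))
```

Because each split is mapped independently and the reduce step only sums per-key counts, the result does not depend on how the input is partitioned, which is what allows the work to be spread over multiple machines.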