Treatment and Research of Massive Data Mining Based on Cloud Computing

This paper presents an optimized version of the SPRINT decision-tree algorithm built on the Hadoop core framework. Following the data mining process, we study the MapReduce programming model used in cloud computing, improve and optimize SPRINT to fit that model, and port the optimized algorithm to the Hadoop platform for distributed data processing.
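To make the idea concrete, the following is a minimal single-process sketch of how SPRINT-style split selection could be phrased as a map step and a reduce step. It is an illustration under stated assumptions, not the optimized algorithm of this paper: the function names (`map_records`, `reduce_best_split`, `best_split`), the record layout, and the restriction to numeric attributes are all hypothetical, and no Hadoop API is used. The map step emits one attribute-list entry per attribute per record; the reduce step sorts each attribute list and scans it once for the threshold with the lowest weighted Gini index.

```python
from collections import Counter, defaultdict


def gini(counts):
    """Gini index of a class histogram: 1 - sum(p_c^2)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def map_records(records, attributes):
    """Map step (hypothetical): emit (attribute, (value, label)) for every record."""
    for row, label in records:
        for attr in attributes:
            yield attr, (row[attr], label)


def reduce_best_split(pairs):
    """Reduce step (hypothetical) for one numeric attribute.

    Sorts the attribute list, then scans it once while maintaining the class
    histograms below and above the candidate threshold.
    Returns (weighted_gini, threshold).
    """
    pairs = sorted(pairs)
    above = Counter(label for _, label in pairs)
    below = Counter()
    n = len(pairs)
    best = (float("inf"), None)
    for i in range(n - 1):
        value, label = pairs[i]
        below[label] += 1
        above[label] -= 1
        next_value = pairs[i + 1][0]
        if value == next_value:
            continue  # no threshold can separate equal values
        n_below = i + 1
        score = (n_below / n) * gini(below) + ((n - n_below) / n) * gini(above)
        if score < best[0]:
            best = (score, (value + next_value) / 2)
    return best


def best_split(records, attributes):
    """Group map output by attribute (the shuffle), reduce, and pick the best attribute."""
    grouped = defaultdict(list)
    for attr, pair in map_records(records, attributes):
        grouped[attr].append(pair)
    results = {attr: reduce_best_split(pairs) for attr, pairs in grouped.items()}
    attr = min(results, key=lambda a: results[a][0])
    score, threshold = results[attr]
    return attr, threshold, score


# Toy data, invented for illustration only.
records = [
    ({"age": 20, "income": 10}, "a"),
    ({"age": 25, "income": 30}, "a"),
    ({"age": 40, "income": 20}, "b"),
    ({"age": 50, "income": 40}, "b"),
]
attr, threshold, score = best_split(records, ["age", "income"])
```

On a real cluster the grouping done by `defaultdict` would be performed by the framework's shuffle phase, and each attribute list could be reduced on a different node; how the paper partitions and optimizes this work is not specified in the abstract.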
