Traditional spatial data mining algorithms run short of storage space and computing power when they process massive spatial data. To address this problem, this paper combines rough set theory with a distributed framework for spatial data mining. Based on the basic theory of rough sets and the Map/Reduce framework, which is efficient and inexpensive, the traditional rough set algorithm for spatial data mining is parallelized. A spatial data example is then used to demonstrate the feasibility of the improved parallel algorithm. Empirical results show that the improved parallel rough set algorithm not only runs more efficiently but also handles massive spatial data that the traditional rough set algorithm can hardly process. The improved algorithm can therefore effectively relieve the shortage of storage and computing power encountered when mining massive spatial data.
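As an illustration only, the following minimal Python sketch shows one way a basic rough set computation can be phrased as map and reduce steps. It is not the algorithm of this paper, whose details are not given in the abstract. The function names (`map_record`, `shuffle`, `reduce_class`, `positive_region`), the attribute names, and the sample records are all hypothetical, and the shuffle phase is simulated in a single process rather than run on a cluster. The map step keys each object by its condition-attribute values, so that each reduce call receives one equivalence class of the indiscernibility relation and can decide whether that class belongs to the positive region.

```python
from collections import defaultdict


def map_record(record, condition_attrs, decision_attr):
    """Map step: emit (condition-value tuple, (object id, decision value))."""
    key = tuple(record[a] for a in condition_attrs)
    return key, (record["id"], record[decision_attr])


def shuffle(pairs):
    """Simulated shuffle: group mapped values by key, as a framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_class(key, values):
    """Reduce step: one call per equivalence class.

    The class is consistent when all of its objects share one decision value.
    """
    ids = sorted(obj_id for obj_id, _ in values)
    decisions = {decision for _, decision in values}
    return key, ids, len(decisions) == 1


def positive_region(records, condition_attrs, decision_attr):
    """Collect the ids of all objects that lie in consistent classes."""
    pairs = [map_record(r, condition_attrs, decision_attr) for r in records]
    positive = []
    for key, values in shuffle(pairs).items():
        _, ids, consistent = reduce_class(key, values)
        if consistent:
            positive.extend(ids)
    return sorted(positive)


# Hypothetical spatial records: two condition attributes and one decision.
records = [
    {"id": 1, "landuse": "urban", "slope": "low", "risk": "high"},
    {"id": 2, "landuse": "urban", "slope": "low", "risk": "high"},
    {"id": 3, "landuse": "forest", "slope": "high", "risk": "low"},
    {"id": 4, "landuse": "forest", "slope": "high", "risk": "high"},
    {"id": 5, "landuse": "farm", "slope": "low", "risk": "low"},
]

pos = positive_region(records, ["landuse", "slope"], "risk")
dependency = len(pos) / len(records)  # share of objects in the positive region
```

Because each equivalence class is handled by an independent reduce call, the classes could in principle be processed on separate nodes, which is the property a Map/Reduce parallelization of rough set computations relies on.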