In existing distributed edge extraction methods based on MapReduce, inappropriate dataset splitting algorithms cause image features to be lost from the results. We present a distributed computing platform called Split Process Cluster (SPC) to resolve this problem. In SPC, images are partitioned with the resilient image pyramid (RIP) model, a multi-layer, redundant data structure we presented earlier, to preserve the integrity of the original image features. SPC then packages the image data as key-value pairs that can be processed by Hadoop, and reduces the results with the density-based spatial clustering of applications with noise (DBSCAN) algorithm. Compared with the traditional method, SPC improves the extraction rate of image features, which indicates that SPC is an effective way to improve the extraction rate of distributed edge extraction.
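The split, map, and reduce stages described above can be illustrated with a small single-process sketch. It is not the SPC implementation: the RIP model is not specified here, so one layer of overlapping tiles stands in for its redundancy, a simple neighbour-difference threshold stands in for the edge detector, and the reduce step uses a minimal hand-written DBSCAN. All function names and parameters (`split_with_overlap`, `map_edges`, `extract_edges`, `tile`, `overlap`, `eps`, `min_pts`) are hypothetical.

```python
def split_with_overlap(image, tile, overlap):
    """Yield (key, value) pairs: key is the tile index, value is (origin, block).

    Assumption: neighbouring tiles share `overlap` rows/columns. This is only
    a stand-in for the redundancy that the RIP model provides.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    h, w = len(image), len(image[0])
    step = tile - overlap
    for r in range(0, h, step):
        for c in range(0, w, step):
            block = [row[c:c + tile] for row in image[r:r + tile]]
            yield (r // step, c // step), ((r, c), block)


def map_edges(key, value, threshold=1):
    """Map step: return global coordinates of edge pixels found in one tile.

    A pixel counts as an edge point if it differs from its right or lower
    neighbour inside the tile by at least `threshold` (a toy detector).
    """
    (r0, c0), block = value
    points = set()
    for i, row in enumerate(block):
        for j, v in enumerate(row):
            right = row[j + 1] if j + 1 < len(row) else v
            below = block[i + 1][j] if i + 1 < len(block) else v
            if abs(v - right) >= threshold or abs(v - below) >= threshold:
                points.add((r0 + i, c0 + j))
    return points


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: return {point: cluster_id}, with -1 marking noise."""
    points = sorted(points)

    def neighbors(p):
        return [q for q in points
                if (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= eps ** 2]

    labels, cluster = {}, 0
    for p in points:
        if p in labels:
            continue
        seeds = neighbors(p)
        if len(seeds) < min_pts:
            labels[p] = -1              # noise, may later become a border point
            continue
        labels[p] = cluster
        queue = [q for q in seeds if q != p]
        while queue:
            q = queue.pop()
            if labels.get(q) == -1:
                labels[q] = cluster     # noise reached from a core point
            if q in labels:
                continue
            labels[q] = cluster
            q_neighbors = neighbors(q)
            if len(q_neighbors) >= min_pts:
                queue.extend(q_neighbors)   # q is a core point: keep expanding
        cluster += 1
    return labels


def extract_edges(image, tile, overlap, eps=1.5, min_pts=2):
    """Split, map each tile, merge the edge points, then cluster (reduce)."""
    merged = set()
    for key, value in split_with_overlap(image, tile, overlap):
        merged |= map_edges(key, value)
    return dbscan(merged, eps, min_pts)


if __name__ == "__main__":
    # 6x8 image with a vertical step edge between columns 3 and 4.
    image = [[0] * 4 + [9] * 4 for _ in range(6)]
    print(len(extract_edges(image, tile=4, overlap=0)))  # split without redundancy
    print(len(extract_edges(image, tile=4, overlap=1)))  # split with redundancy
```

In this toy setup the step edge falls exactly on a tile boundary, so a split with `overlap=0` finds no edge points at all, while `overlap=1` recovers the same six points as processing the whole image at once. That is the kind of feature loss a redundant partition is meant to avoid.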