Paper Information - Multiple K Means++ Clustering of Satellite Image Using Hadoop MapReduce and Spark

Image clustering is an important step in mining satellite images. In our experiment, we simultaneously ran multiple K-means algorithms with different initial centroids and values of k in the same iteration of MapReduce jobs. To initialize the centroids, we implemented Scalable K-Means++ as a MapReduce (MR) job [1]. We also ran a validation algorithm, the Simplified Silhouette Index [2], on the multiple clustering outputs, again in the same iteration of MR jobs. This paper explores the behavior of the above-mentioned clustering algorithms when run on big data platforms such as MapReduce and Spark. Spark was chosen because it is popular for fast processing, particularly where iterations are involved.
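
The workflow summarized in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' MapReduce or Spark implementation: it uses plain sequential k-means++ seeding in place of the Scalable K-Means++ MR job, and the function names (`kmeanspp_seed`, `kmeans`, `simplified_silhouette`, `select_best`), the toy pixel data, and the `restarts` parameter are all assumptions made for illustration. The Simplified Silhouette is computed from point-to-centroid distances only: for each point, a is the distance to its own centroid, b is the distance to the nearest other centroid, and the score is (b - a) / max(a, b), averaged over all points.

```python
import math
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng):
    """Sequential k-means++ seeding (assumption: stands in for Scalable K-Means++)."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Weight each point by squared distance to its nearest chosen centroid.
        weights = [min(sq_dist(p, c) for c in centroids) for p in points]
        if sum(weights) == 0:  # every point already coincides with a centroid
            centroids.append(rng.choice(points))
        else:
            centroids.append(rng.choices(points, weights=weights, k=1)[0])
    return centroids


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def kmeans(points, k, rng, iters=20):
    """Lloyd's iterations starting from k-means++ seeds."""
    centroids = kmeanspp_seed(points, k, rng)
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[j])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def simplified_silhouette(points, centroids, labels):
    """Mean of (b - a) / max(a, b) using centroid distances; needs k >= 2."""
    total = 0.0
    for p, lab in zip(points, labels):
        a = math.sqrt(sq_dist(p, centroids[lab]))
        b = min(math.sqrt(sq_dist(p, c))
                for j, c in enumerate(centroids) if j != lab)
        total += (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return total / len(points)


def select_best(points, ks, restarts=3, seed=0):
    """Run several k-means instances per k and keep the best score for each k."""
    rng = random.Random(seed)
    scores = {}
    for k in ks:
        for _ in range(restarts):
            centroids, labels = kmeans(points, k, rng)
            score = simplified_silhouette(points, centroids, labels)
            scores[k] = max(score, scores.get(k, -1.0))
    return max(scores, key=scores.get), scores


# Toy "pixels" (hypothetical RGB values) forming three well-separated groups.
pixels = [(10, 10, 10), (12, 11, 9), (11, 9, 10),
          (200, 200, 200), (198, 201, 199), (202, 199, 201),
          (10, 200, 10), (12, 198, 11), (9, 201, 12)]
best_k, scores = select_best(pixels, ks=[2, 3, 4])
print(best_k, scores)
```

In this sketch the runs for different k and different seeds execute one after another. The abstract instead describes evaluating them together within the same iteration of MR jobs, which this example does not attempt to reproduce.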

Vinod Shokeen | Tapan Sharma | Sunil Mathur

[1] Ali S. Hadi, et al. Finding Groups in Data: An Introduction to Cluster Analysis, 1991.

[2] Jeremy Freeman. Open source tools for large-scale neuroscience, 2015, Current Opinion in Neurobiology.

[3] Sergei Vassilvitskii, et al. k-means++: the advantages of careful seeding, 2007, SODA '07.

[4] Jiawei Han, et al. Data Mining: Concepts and Techniques, 2000.

[5] Donald Miner, et al. MapReduce design patterns, 2012.

[6] Murilo Coelho Naldi, et al. Multiple Parallel MapReduce k-Means Clustering with Validation and Selection, 2014, 2014 Brazilian Conference on Intelligent Systems.

[7] Keqiu Li, et al. Efficient k-Means++ Approximation with MapReduce, 2014, IEEE Trans. Parallel Distributed Syst.

[8] Bo Li, et al. Parallel K-Means Clustering of Remote Sensing Images Based on MapReduce, 2010, WISM.

[9] Ricardo J. G. B. Campello, et al. Relative clustering validity criteria: A comparative overview, 2010, Stat. Anal. Data Min.

[10] Qing He, et al. Parallel K-Means Clustering Based on MapReduce, 2009, CloudCom.

[11] Sergei Vassilvitskii, et al. Scalable K-Means++, 2012, Proc. VLDB Endow.