Fuzzy clustering of large satellite images using high performance computing

Fuzzy clustering is one of the most frequently used methods for identifying homogeneous regions in remote sensing images. For large images, the computational cost of fuzzy clustering can be prohibitive unless high performance computing is used, so efficient parallel implementations are highly desirable. This paper presents results on the efficiency of a parallelization strategy for the Fuzzy c-Means (FCM) algorithm. The strategy is also extended to two FCM variants that incorporate spatial information: Spatial FCM and Gaussian Kernel-based FCM with spatial bias correction. The high-level requirements that guided the design of the proposed parallel implementations are: (i) find an appropriate partitioning of large images that balances the load across processors; (ii) use collective computations as much as possible; (iii) reduce the cost of communication between processors. The parallel implementations were evaluated on several test cases, including multispectral images and images with a large number of pixels. The experiments were conducted on both a computational cluster and a BlueGene/P supercomputer with up to 1024 processors. Generally, good scalability was obtained with respect to both the number of clusters and the number of spectral bands.
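The three requirements above can be illustrated with a small data-parallel sketch of the standard FCM center update. This is not the paper's implementation: the abstract gives no code, so the function names (`split_evenly`, `partial_sums`, `update_centers`), the contiguous partitioning, and the fuzzifier default `m=2.0` are all assumptions made for illustration. Workers are simulated by a plain loop, and the final summation stands in for a collective reduction (for example an all-reduce in a message-passing library).

```python
# Sketch only: data-parallel FCM center update in pure Python (no MPI).
# All names here are hypothetical and chosen for illustration.

def memberships(pixel, centers, m=2.0):
    """Standard FCM membership of one pixel to each cluster center."""
    d2 = [sum((p - c) ** 2 for p, c in zip(pixel, center)) for center in centers]
    # A pixel that coincides with a center belongs fully to that center.
    for i, d in enumerate(d2):
        if d == 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(centers))]
    exponent = 1.0 / (m - 1.0)
    inv = [(1.0 / d) ** exponent for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def split_evenly(pixels, n_workers):
    """Requirement (i): contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(pixels), n_workers)
    chunks, start = [], 0
    for rank in range(n_workers):
        size = base + (1 if rank < extra else 0)
        chunks.append(pixels[start:start + size])
        start += size
    return chunks


def partial_sums(chunk, centers, m=2.0):
    """Local work of one worker: weighted sums over its own pixels only."""
    n_bands = len(centers[0])
    num = [[0.0] * n_bands for _ in centers]
    den = [0.0] * len(centers)
    for pixel in chunk:
        for i, u in enumerate(memberships(pixel, centers, m)):
            w = u ** m
            den[i] += w
            for b in range(n_bands):
                num[i][b] += w * pixel[b]
    return num, den


def update_centers(pixels, centers, n_workers, m=2.0):
    """Requirements (ii) and (iii): only small partial sums are combined."""
    partials = [partial_sums(c, centers, m) for c in split_evenly(pixels, n_workers)]
    n_bands = len(centers[0])
    new_centers = []
    for i in range(len(centers)):
        # This summation plays the role of the collective reduction.
        den = sum(p[1][i] for p in partials)
        new_centers.append(
            [sum(p[0][i][b] for p in partials) / den for b in range(n_bands)]
        )
    return new_centers
```

In this sketch each worker contributes only a clusters-by-bands array plus one value per cluster, so the amount of data exchanged per iteration does not depend on the number of pixels. The spatial FCM variants would additionally need neighboring pixels at chunk borders, which this sketch does not address.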
