Large-Scale Parallel Data Clustering

Algorithmic enhancements are described that substantially reduce the computation required for mean-square-error data clustering. These enhancements are incorporated into P-CLUSTER, a parallel data-clustering tool designed to run on a network of workstations. Experiments were performed on the unsupervised segmentation of standard texture images. For some data sets, a 96 percent reduction in computation was achieved.
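For orientation, the baseline that mean-square-error clustering refers to can be sketched as the standard batch k-means iteration: assign each pattern to its nearest cluster center, then recompute each center as the mean of its members. The sketch below is an illustration only. It does not reproduce the enhancements or the parallel decomposition used in P-CLUSTER, which the abstract does not detail, and the names `squared_distance` and `kmeans` and the `iterations` and `seed` parameters are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain batch k-means (illustrative baseline, not P-CLUSTER itself).

    Returns (centers, labels, mse), where mse is the mean squared
    distance from each point to its assigned cluster center.
    """
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None

    for _ in range(iterations):
        # Assignment step: nearest center for every point.
        # This is the step that dominates the computation.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        converged = new_labels == labels
        labels = new_labels

        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(c) / len(members) for c in zip(*members)]

        if converged:
            break

    mse = sum(
        squared_distance(p, centers[lab]) for p, lab in zip(points, labels)
    ) / len(points)
    return centers, labels, mse


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, labels, mse = kmeans(data, k=2)
    print(labels, round(mse, 4))
```

The assignment step costs one distance computation per point per center per iteration, so this is presumably where a computational reduction of the kind reported would have the most effect.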
