Fast agglomerative clustering using information of k-nearest neighbors

In this paper, we develop a method to lower the computational complexity of the pairwise nearest neighbor (PNN) algorithm. After each cluster merge, our approach determines a set of candidate clusters that may need to be updated. For those candidates that do require updating, their k-nearest neighbors are found. The number of distance calculations for our method is O(N^2), where N is the number of data points. To further reduce the computational cost of the proposed algorithm, existing fast search approaches are also applied. Compared to available approaches, our proposed algorithm significantly reduces both the computing time and the number of distance calculations. For a data set taken from a real image, our method reduces the computing time by a factor of about 26.8 compared to FPNN and by a factor of about 3.8 compared to PMLFPNN.
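As a rough illustration of the general idea of updating only a candidate set after each merge, the following Python sketch implements exact PNN with a cached nearest neighbor per cluster. It is not the method proposed in this paper: it keeps a single nearest neighbor rather than k of them, uses brute-force search in place of the fast search approaches mentioned above, and therefore does not achieve the stated distance-calculation bound. The function names `merge_cost` and `pnn_with_nn_cache` and the counter `n_dist` are hypothetical, and the merge cost is assumed to be the usual size-weighted squared centroid distance (Ward's criterion).

```python
import numpy as np


def merge_cost(n_a, c_a, n_b, c_b):
    """Increase in total squared error if two clusters are merged (Ward cost)."""
    diff = c_a - c_b
    return (n_a * n_b) / (n_a + n_b) * float(diff @ diff)


def pnn_with_nn_cache(points, target_clusters):
    """Exact PNN sketch: merge the cheapest pair until target_clusters remain.

    Returns (clusters, n_dist), where clusters is a list of lists of point
    indices and n_dist counts merge-cost evaluations.
    """
    points = np.asarray(points, dtype=float)
    if target_clusters < 1:
        raise ValueError("target_clusters must be at least 1")

    sizes = {i: 1 for i in range(len(points))}
    cents = {i: points[i].copy() for i in range(len(points))}
    members = {i: [i] for i in range(len(points))}
    n_dist = 0

    def nearest(a):
        """Brute-force search for the cluster that is cheapest to merge with a."""
        nonlocal n_dist
        best, best_cost = None, float("inf")
        for b in sizes:
            if b == a:
                continue
            n_dist += 1
            cost = merge_cost(sizes[a], cents[a], sizes[b], cents[b])
            if cost < best_cost:
                best, best_cost = b, cost
        return best, best_cost

    # Cached nearest neighbor (and its merge cost) for every cluster.
    nn = {a: nearest(a) for a in sizes}

    while len(sizes) > target_clusters:
        # Pick the globally cheapest pair from the cache and merge b into a.
        a = min(nn, key=lambda c: nn[c][1])
        b = nn[a][0]
        n = sizes[a] + sizes[b]
        cents[a] = (sizes[a] * cents[a] + sizes[b] * cents[b]) / n
        sizes[a] = n
        members[a].extend(members[b])
        del sizes[b], cents[b], members[b], nn[b]
        if len(sizes) == 1:
            break

        # Candidate set: the merged cluster itself, plus every cluster whose
        # cached neighbor was one of the two merged clusters. All other cache
        # entries stay valid because, under the Ward cost, the cost of merging
        # with the new cluster is never below the smaller of the two old costs.
        stale = [c for c in sizes if c == a or nn[c][0] in (a, b)]
        for c in stale:
            nn[c] = nearest(c)

    return list(members.values()), n_dist
```

For example, `pnn_with_nn_cache([[0, 0], [0, 1], [10, 10], [10, 11], [50, 50]], 3)` groups the two nearby pairs and leaves the distant point on its own. A k-nearest-neighbor variant would store several neighbors per cluster so that a stale entry can often be repaired from the stored list without a full search.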
