An efficient agglomerative clustering algorithm using a heap

Abstract An efficient algorithm for agglomerative clustering is presented. The algorithm uses a heap in which the distances between all pairs of clusters are stored, so that the nearest pair of clusters is given by the element at the root node of the binary tree corresponding to the heap. Updating the heap at each stage of the hierarchy is easily implemented by shifting elements up or down along a path of the heap tree. The computation time of the algorithm is at most O(N^2 log N) when N objects are to be classified.
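
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: it assumes single linkage and Euclidean distance (the abstract names neither), and instead of shifting heap elements in place it uses lazy deletion with the standard `heapq` module, skipping stale pairs when they reach the root. The function name `agglomerate` and the merge-history format are hypothetical choices made for illustration. The heap holds O(N^2) entries in total and each push or pop costs O(log N), so this variant stays within the O(N^2 log N) bound.

```python
import heapq
import math
from itertools import combinations


def agglomerate(points):
    """Heap-driven agglomerative clustering (sketch, single linkage assumed).

    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    The original objects have ids 0..N-1; merged clusters get ids N, N+1, ...
    """
    n = len(points)
    # dist[a][b] is the current distance between active clusters a and b.
    dist = {i: {} for i in range(n)}
    heap = []
    for i, j in combinations(range(n), 2):
        d = math.dist(points[i], points[j])  # Euclidean distance (assumption)
        dist[i][j] = dist[j][i] = d
        heap.append((d, i, j))
    heapq.heapify(heap)  # one entry per pair of objects

    active = set(range(n))
    merges = []
    next_id = n
    while len(active) > 1:
        d, i, j = heapq.heappop(heap)  # root of the heap = smallest distance
        if i not in active or j not in active:
            continue  # stale entry: one of the two clusters was merged earlier
        active.discard(i)
        active.discard(j)
        merges.append((i, j, d))

        new = next_id
        next_id += 1
        dist[new] = {}
        for k in active:
            # Single-linkage update (assumption): keep the nearer of the two.
            dk = min(dist[i][k], dist[j][k])
            dist[new][k] = dist[k][new] = dk
            heapq.heappush(heap, (dk, new, k))
            # Distances to the merged clusters are no longer needed.
            del dist[k][i], dist[k][j]
        del dist[i], dist[j]
        active.add(new)
    return merges
```

Calling `agglomerate([(0.0,), (1.0,), (5.0,), (6.5,)])` returns three merges, one per level of the hierarchy. Lazy deletion trades some extra heap entries for simplicity; an indexed heap that tracks each pair's position would allow the in-place shifting the abstract describes.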