SIHC: A Stable Incremental Hierarchical Clustering Algorithm

SAHN is a widely used agglomerative hierarchical clustering method. Nevertheless it is not an incremental algorithm and therefore it is not suitable for many real application areas where all data is not available at the beginning of the process. Some authors proposed incremental variants of SAHN. Their goal was to obtain the same results in incremental environments. This approach is not practical since frequently must rebuild the hierarchy, or a big part of it, and often leads to completely different structures. We propose a novel algorithm, called SIHC, that updates SAHN hierarchies with minor changes in the previous structures. This property makes it suitable for real environments. Results on 11 synthetic and 6 real datasets show that SIHC builds high quality clustering hierarchies. This quality level is similar and sometimes better than SAHN’s. Moreover, the computational complexity of SIHC is lower than SAHN’s.