Data clustering analysis with ant-based template mechanism

Data clustering algorithms play an important role in the effective analysis and organization of massive amounts of information. In recent years, biologically inspired data clustering algorithms have drawn much attention in data mining research because of their simplicity and efficiency. Clustering algorithms with an ant-based template mechanism (Ant_TM) are promising for the analysis of large datasets because of their self-organization mechanism, but they suffer from a slow convergence rate and an inability to separate different classes of data items, which leads to suboptimal clustering results. Many researchers have proposed refinement measures to mitigate these problems, but these measures make the algorithms increasingly complex and sacrifice their simplicity. Our work focuses on two aspects. First, we identify existing problems of Ant_TM and propose simple yet efficient solutions that improve the performance of existing algorithms by applying a novel splitting rule and hybridizing Ant_TM with the K-means algorithm. Second, we further enhance Ant_TM so that it can automatically determine the optimal number of clusters for both spherical and arbitrarily shaped datasets. Analytical and empirical results show that the newly proposed algorithms can substantially improve clustering performance.