Using Swarm Intelligence for XML Clustering

Mining large collections of XML documents makes them easier to query and manage. This paper proposes a novel XML document clustering method based on swarm intelligence. First, the approach extracts path sequences from the documents and transforms each document into a vector in a high-dimensional Euclidean space. The CSX clustering method is then applied to the resulting vectors with high performance. The advantage of the approach is that swarm intelligence helps the search escape local optima of the search space. Experiments on data sets obtained from DBLP show that the proposed technique outperforms the standard C-means method in cluster compactness and accuracy.
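
As an illustration of the first two steps (path extraction and vectorization), a minimal Python sketch is given here. The function names `extract_paths` and `vectorize`, the choice of root-to-leaf tag paths, and the path-count weighting are assumptions made for illustration and may differ from the representation used in the paper; the CSX clustering step is not shown.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def extract_paths(xml_text):
    """Return the root-to-leaf tag paths of one XML document, in document order.

    Assumption: a "path sequence" is taken to be the list of tag paths
    from the root to each leaf element; text and attributes are ignored.
    """
    root = ET.fromstring(xml_text)
    paths = []

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        children = list(node)
        if not children:  # leaf element: record the complete path
            paths.append(path)
        for child in children:
            walk(child, path)

    walk(root, "")
    return paths


def vectorize(docs):
    """Map each document to a vector of path counts over a shared vocabulary.

    Assumption: raw counts are used as weights; each distinct path is one
    dimension of the Euclidean space.
    """
    counts = [Counter(extract_paths(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    vectors = [[float(c[p]) for p in vocab] for c in counts]
    return vocab, vectors


if __name__ == "__main__":
    docs = [
        "<article><title>a</title><author>b</author><author>c</author></article>",
        "<book><title>x</title></book>",
    ]
    vocab, vectors = vectorize(docs)
    print(vocab)
    print(vectors)
```

Documents that share many structural paths end up close together in this space, which is what a distance-based clustering method relies on.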