A spectral document clustering algorithm using evidence accumulation

A weakness of spectral clustering is the difficulty of choosing a suitable similarity measure. To address this, a spectral document clustering algorithm based on evidence accumulation was proposed. In this algorithm, spherical K-means was first run multiple times over the document set. The partitioning produced by each run was treated as evidence for judging whether two documents should be placed in the same cluster. On this basis, the similarity matrix and the normalized Laplacian matrix of the documents were constructed. Experiments on the Text REtrieval Conference (TREC) and Reuters document sets demonstrated the effectiveness of the proposed algorithm: it outperformed hierarchical clustering algorithms as well as the K-means algorithm provided in the CLUTO general-purpose clustering toolkit.
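
The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: the similarity between two documents is taken as the fraction of spherical K-means runs in which they share a cluster (a co-association matrix), the normalized Laplacian is taken as L = I - D^{-1/2} S D^{-1/2}, and the final partition is obtained by clustering the row-normalized eigenvectors of the k smallest eigenvalues. All function names, parameters, and the toy term-count matrix are hypothetical.

```python
import numpy as np


def unit_rows(X):
    """Scale each row to unit Euclidean length (zero rows stay zero)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, 1e-12)


def spherical_kmeans(X, k, rng, n_iter=20):
    """Cosine-similarity K-means on unit-length rows; returns one label per row."""
    centroids = X[rng.choice(X.shape[0], size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = np.argmax(X @ centroids.T, axis=1)
        for j in range(k):
            total = X[labels == j].sum(axis=0)
            norm = np.linalg.norm(total)
            if norm > 0:  # keep the old centroid if the cluster is empty
                centroids[j] = total / norm
    return np.argmax(X @ centroids.T, axis=1)


def co_association(X, k, n_runs=10, seed=0):
    """Assumed evidence rule: fraction of runs in which two documents co-cluster."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    S = np.zeros((n, n))
    for _ in range(n_runs):
        labels = spherical_kmeans(X, k, rng)
        S += labels[:, None] == labels[None, :]
    return S / n_runs


def normalized_laplacian(S):
    """L = I - D^{-1/2} S D^{-1/2}, with D the diagonal degree matrix of S."""
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1))
    return np.eye(S.shape[0]) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]


def spectral_labels(L, k, seed=0):
    """Assumed final step: cluster the eigenvectors of the k smallest eigenvalues."""
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    embedding = unit_rows(vecs[:, :k])
    return spherical_kmeans(embedding, k, np.random.default_rng(seed))


# Hypothetical term-count matrix: 6 documents, 4 terms.
docs = np.array([
    [3, 1, 0, 0],
    [2, 2, 0, 1],
    [4, 1, 1, 0],
    [0, 0, 3, 2],
    [1, 0, 2, 3],
    [0, 1, 4, 4],
], dtype=float)

X = unit_rows(docs)
S = co_association(X, k=2, n_runs=10)
L = normalized_laplacian(S)
labels = spectral_labels(L, k=2)
print(labels)
```

In this sketch the similarity matrix is never chosen by hand: it emerges from how often the repeated partitionings agree, which is the sense in which evidence accumulation sidesteps the choice of a similarity measure. The number of runs and the number of clusters are free parameters here and would need tuning on real document sets.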