Paper information - KNNCC: An algorithm for k-nearest neighbor clique clustering


The k-nearest neighbor algorithm is one of the most widely used classification and clustering algorithms. It is simple, fast, straightforward, and effective. However, the relationship between nearest items is only a partial order: being among an item's k nearest neighbors does not guarantee the reverse. Because this connection is not strong, items can be forced into the same cluster. To address this, we propose the concept of a k-nearest neighbor clique, built on k-nearest neighbors and reverse k-nearest neighbors. First, by measuring the similarity between items, we select the pairs of items that are mutually k-nearest neighbors and reverse k-nearest neighbors of each other. These items are used to construct k-nearest neighbor cliques. Since the relationship between items in the same clique is a total order, they are highly similar to one another. Then, we use the cliques as new data to seed the next round of clustering. This process is repeated until a stopping condition is satisfied. Finally, experiments on real-world datasets validate the effectiveness of the proposed algorithm.
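
The clique construction described in the abstract can be illustrated with a minimal Python sketch of a single round. This is not the authors' implementation. The abstract does not specify the similarity measure, how overlapping cliques are resolved, or the stopping condition, so the sketch assumes Euclidean distance, a greedy largest-first choice of disjoint cliques, and singleton groups for leftover items. All function names (knn_sets, mutual_knn_graph, maximal_cliques, knn_cliques) are hypothetical.

```python
import numpy as np


def knn_sets(points, k):
    """For each point, return the index set of its k nearest neighbors.

    Assumption: Euclidean distance stands in for the similarity measure.
    """
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    order = np.argsort(dist, axis=1)[:, :k]
    return [set(int(j) for j in row) for row in order]


def mutual_knn_graph(neighbors):
    """Keep edge i-j only if j is in kNN(i) and i is in kNN(j).

    The second condition says that j is also a reverse k-nearest neighbor of i.
    """
    return [{j for j in nbrs if i in neighbors[j]}
            for i, nbrs in enumerate(neighbors)]


def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(range(len(adj))), set())
    return cliques


def knn_cliques(points, k):
    """One round: group items into disjoint mutual-kNN cliques.

    Assumption: overlaps are resolved greedily, largest clique first;
    items left over become singleton groups.
    """
    adj = mutual_knn_graph(knn_sets(points, k))
    chosen, used = [], set()
    ranked = sorted(maximal_cliques(adj), key=lambda c: (-len(c), sorted(c)))
    for clique in ranked:
        if not (clique & used):
            chosen.append(sorted(clique))
            used |= clique
    chosen.extend([i] for i in range(len(adj)) if i not in used)
    return chosen


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(knn_cliques(data, k=2))
```

To repeat the process as the abstract describes, each clique could be summarized (for example by its centroid, which is also an assumption) and the summaries passed back to knn_cliques until a chosen stopping condition holds. Maximal clique enumeration is exponential in the worst case, so this sketch is only suitable for small inputs.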

Xiaorui Wei | Ruifen Yuan | Chao Qu
