Authority-shift clustering: Hierarchical clustering by authority seeking on graphs

This paper introduces a novel hierarchical clustering method based on link analysis techniques. The algorithm is formulated as an authority-seeking procedure on graphs, in which each node shifts toward nodes with high authority scores. The personalized PageRank score of the graph serves as the authority score. Building on this concept, we achieve hierarchical clustering by iteratively propagating the authority scores to other nodes and shifting the authority nodes. This scheme addresses the chicken-and-egg difficulty in hierarchical clustering through a semiglobal bottom-up approach that exploits the global structure of the graph. The experimental evaluation demonstrates that our algorithm performs better than existing graph-based approaches on clustering and image segmentation tasks.
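As a rough illustration of the authority-seeking idea, the following Python sketch computes personalized PageRank scores on a small graph and lets every node shift to the node that scores highest from its own point of view. It is not the paper's algorithm: the abstract does not specify the update rule, so the closed-form PageRank solve, the damping factor `alpha`, the helper names `personalized_pagerank`, `authority_shift`, and `two_star_graph`, and the toy graph are all assumptions made for this example. Only a single level is shown; the iterative propagation that builds the hierarchy is omitted.

```python
import numpy as np


def personalized_pagerank(W, alpha=0.85):
    """Return R, where R[i, j] is the score of node j for a walk restarting at node i.

    Assumes W is a nonnegative affinity matrix in which every node has an edge.
    """
    W = np.asarray(W, dtype=float)
    P = W / W.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    n = W.shape[0]
    # Each row solves r = (1 - alpha) * e_i + alpha * r @ P in closed form.
    return (1.0 - alpha) * np.linalg.inv(np.eye(n) - alpha * P)


def authority_shift(W, alpha=0.85):
    """Assign every node to the authority node it reaches by repeated shifting."""
    R = personalized_pagerank(W, alpha)
    target = R.argmax(axis=1)  # one shift: move to the highest-scoring node
    # Follow the shifts until nothing changes; the bound guards against ties.
    for _ in range(len(target)):
        nxt = target[target]
        if np.array_equal(nxt, target):
            break
        target = nxt
    return target


def two_star_graph():
    """Toy graph (an assumption): two 4-node stars joined at their centers 0 and 4."""
    W = np.zeros((8, 8))
    edges = [(0, 1), (0, 2), (0, 3), (4, 5), (4, 6), (4, 7), (0, 4)]
    for i, j in edges:
        W[i, j] = W[j, i] = 1.0
    return W


labels = authority_shift(two_star_graph())
print(labels)  # [0 0 0 0 4 4 4 4]
```

On this toy graph each leaf shifts to the center of its own star, and the two centers remain their own authorities, which yields two clusters. On graphs without such hubs a node tends to score itself highest because of the restart mass, so a practical variant would need the additional propagation step the abstract describes.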
