SEMI-SUPERVISED CLUSTERING BASED ON SIGNED TOTAL VARIATION

We consider the problem of semi-supervised clustering on signed graphs, whose edges model similarity and dissimilarity relations between nodes. We introduce a signed version of total variation and use it to formulate a convex optimization problem for the cluster labels. This optimization problem includes a 1-norm regularization to cover cases where only a few cluster labels are known. We propose an ADMM-based algorithm to solve the optimization problem. The complexity of this algorithm scales linearly with the number of edges of the graph. Our scheme is suitable for distributed implementation and can therefore efficiently handle large-scale applications. Numerical experiments indicate that our clustering scheme outperforms existing methods.
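
To make the notion of a signed total variation concrete, the following is a minimal sketch under stated assumptions, not the definition used in the paper. It assumes the signed total variation of a label vector x is the sum over edges (i, j) of w_ij * |x_i - s_ij * x_j|, where s_ij = +1 marks a similarity edge and s_ij = -1 a dissimilarity edge. The function name, the edge-list layout, and the toy graph are hypothetical and chosen only for illustration.

```python
import numpy as np


def signed_total_variation(x, edges, weights, signs):
    """Assumed form: sum over edges of w_ij * |x_i - s_ij * x_j|.

    x       : (n,) array of node labels
    edges   : (m, 2) integer array of node index pairs (i, j)
    weights : (m,) array of non-negative edge weights w_ij
    signs   : (m,) array with +1 (similar) or -1 (dissimilar) per edge
    """
    i, j = edges[:, 0], edges[:, 1]
    # One pass over the edge list, so the cost is linear in the number of edges.
    return float(np.sum(weights * np.abs(x[i] - signs * x[j])))


# Hypothetical toy graph: a path 0 - 1 - 2 - 3 where the middle edge
# encodes dissimilarity and the outer edges encode similarity.
edges = np.array([[0, 1], [1, 2], [2, 3]])
weights = np.ones(3)
signs = np.array([1.0, -1.0, 1.0])

# Labels that respect every edge sign: nodes {0, 1} vs. nodes {2, 3}.
consistent = np.array([1.0, 1.0, -1.0, -1.0])
# Labels that ignore the dissimilarity edge between nodes 1 and 2.
inconsistent = np.array([1.0, 1.0, 1.0, 1.0])

tv_consistent = signed_total_variation(consistent, edges, weights, signs)
tv_inconsistent = signed_total_variation(inconsistent, edges, weights, signs)
print(tv_consistent, tv_inconsistent)
```

Under this assumed definition, a labeling that agrees with all edge signs incurs zero penalty, while putting the two dissimilar nodes in the same cluster is penalized. The function is convex in x, which is consistent with the convex formulation described above.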
