Constrained Spectral Clustering Network with Self-Training

Deep spectral clustering networks have proven effective because they integrate feature learning with cluster assignment and can handle non-convex clusters. Nevertheless, deep spectral clustering remains an ill-posed problem. Specifically, the affinity learned by SpectralNet, the most prominent method of this kind, is not guaranteed to be consistent with local invariance, which hurts the final clustering performance. In this paper, we propose a novel framework, the Constrained Spectral Clustering Network (CSCN), which incorporates pairwise constraints and clustering-oriented fine-tuning to deal with this ill-posedness. To the best of our knowledge, this is the first constrained deep spectral clustering method. Another advantage of CSCN over existing constrained deep clustering networks is that it propagates pairwise constraints throughout the entire dataset. In addition, we design a clustering-oriented loss based on self-training that simultaneously fine-tunes the feature representations and performs cluster assignment, which further improves clustering quality. Extensive experiments on benchmark datasets demonstrate that our approach outperforms state-of-the-art clustering methods.
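The abstract does not spell out the form of the clustering-oriented self-training loss. As a rough illustration only, the sketch below uses the self-training scheme popularized by Deep Embedded Clustering (DEC): soft assignments from a Student's t kernel, a sharpened target distribution, and a KL divergence between the two. Whether CSCN uses exactly this form is an assumption; the function names, the toy data, and the fixed cluster centers are all hypothetical.

```python
import numpy as np

def soft_assign(z, centers, alpha=1.0):
    """Soft cluster assignments q_ij from a Student's t kernel (DEC-style)."""
    # Squared Euclidean distance between every embedding and every center: (n, k)
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Sharpened targets p_ij: square q and normalize by soft cluster frequency."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def kl_loss(p, q, eps=1e-12):
    """KL(P || Q), the self-training objective minimized during fine-tuning."""
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

# Toy embeddings (hypothetical): two well-separated 2-D blobs of 20 points each.
rng = np.random.default_rng(0)
z = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(3.0, 0.3, (20, 2))])
centers = np.array([[0.0, 0.0], [3.0, 3.0]])  # assumed fixed for this sketch

q = soft_assign(z, centers)      # current soft assignments
p = target_distribution(q)       # self-training targets derived from q
loss = kl_loss(p, q)             # value a network would backpropagate
labels = q.argmax(axis=1)        # hard cluster assignment
```

In a full pipeline the embeddings `z` would come from the network and the gradient of `loss` would update both the network weights and the centers; here both are held fixed to keep the example self-contained.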
