Learning the Precise Feature for Cluster Assignment

Clustering is a fundamental task in computer vision and pattern recognition. Recently, deep clustering methods (algorithms based on deep learning) have attracted wide attention for their impressive performance. Most of these algorithms combine deep unsupervised representation learning with standard clustering. However, separating representation learning from clustering leads to suboptimal solutions, because the two-stage strategy prevents representation learning from adapting to the subsequent task (e.g., clustering according to specific cues). To overcome this issue, efforts have been made to adapt the representation and the cluster assignment dynamically, but current state-of-the-art methods rely on heuristically constructed objectives in which the representation and the cluster assignment are optimized alternately. To further standardize the clustering problem, we formulate the objective of clustering as finding a precise feature that serves as the cue for cluster assignment. Based on this formulation, we propose a general-purpose deep clustering framework that, for the first time, fully integrates representation learning and clustering into a single pipeline. The proposed framework exploits the strong ability of recently developed generative models to learn intrinsic features, and imposes entropy minimization on the distribution of the cluster assignment through a dedicated variational algorithm. Experimental results show that the proposed method is superior, or at least comparable, to state-of-the-art methods on handwritten digit recognition, fashion recognition, face recognition, and object recognition benchmark datasets.
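To make the entropy-minimization idea concrete, the following is a minimal sketch of the quantity such an objective would drive down: the average Shannon entropy of per-sample soft cluster assignments. It is not the dedicated variational algorithm of the proposed framework, which the abstract does not specify. The softmax parameterization, the function names, and the example logits are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Turn one sample's cluster logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def assignment_entropy(probs):
    """Shannon entropy (in nats) of one soft cluster assignment."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_assignment_entropy(batch_logits):
    """Average entropy over a batch; lower means more confident assignments."""
    entropies = [assignment_entropy(softmax(row)) for row in batch_logits]
    return sum(entropies) / len(entropies)


# Hypothetical logits for 3 samples and 4 clusters (illustrative values only).
confident = [
    [10.0, 0.0, 0.0, 0.0],
    [0.0, 10.0, 0.0, 0.0],
    [0.0, 0.0, 10.0, 0.0],
]
uncertain = [[0.0, 0.0, 0.0, 0.0] for _ in range(3)]

h_confident = mean_assignment_entropy(confident)
h_uncertain = mean_assignment_entropy(uncertain)
```

In this sketch a uniform assignment over four clusters attains the maximum entropy log 4, while near one-hot assignments score close to zero, so minimizing the average pushes each sample toward a single cluster.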
