Deep Clustering With Variational Autoencoder

An autoencoder that learns a latent space in an unsupervised manner has many applications in signal processing. However, the latent space of an autoencoder does not pursue the same clustering goal as K-means or a Gaussian mixture model (GMM). A recent work proposes to artificially re-align each point in the latent space of an autoencoder to its nearest class neighbors during training (Song et al. 2013). The resulting latent space is found to be much more suitable for clustering, since clustering information is used during training. Inspired by this work, in this letter we propose several extensions to the technique. First, we propose a probabilistic approach that generalizes Song's approach, such that Euclidean distance in the latent space is replaced by KL divergence. Second, as a consequence of this generalization, we can use probability distributions as inputs rather than points in the latent space. Third, we propose using a Bayesian Gaussian mixture model for clustering in the latent space. We demonstrate the proposed method on the digit recognition datasets MNIST, USPS, and SVHN, as well as the scene datasets Scene15 and MIT67, with interesting findings.
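To illustrate the first two extensions, the following is a minimal sketch of measuring closeness between a latent encoding and cluster components with KL divergence instead of Euclidean distance. It is not the authors' training objective: it assumes that both the encoding and each cluster are diagonal Gaussians, and the names `kl_diag_gaussians`, `nearest_cluster`, and the toy values are hypothetical.

```python
import math


def kl_diag_gaussians(mu_q, var_q, mu_p, var_p):
    """KL( N(mu_q, diag(var_q)) || N(mu_p, diag(var_p)) ) in closed form."""
    total = 0.0
    for mq, vq, mp, vp in zip(mu_q, var_q, mu_p, var_p):
        total += math.log(vp / vq) + (vq + (mq - mp) ** 2) / vp - 1.0
    return 0.5 * total


def nearest_cluster(mu_q, var_q, clusters):
    """Index of the cluster Gaussian with the smallest KL divergence from the encoding."""
    divergences = [kl_diag_gaussians(mu_q, var_q, mu, var) for mu, var in clusters]
    return min(range(len(clusters)), key=divergences.__getitem__)


# Assumed toy setup: two cluster components, each given as (mean, variance).
clusters = [([0.0, 0.0], [1.0, 1.0]), ([3.0, 3.0], [1.0, 1.0])]

# Assumed encoder output for one sample: a distribution, not a single point.
encoding_mu, encoding_var = [2.5, 2.0], [0.5, 0.5]

assigned = nearest_cluster(encoding_mu, encoding_var, clusters)
```

When both Gaussians have identical unit variances, this KL divergence reduces to half the squared Euclidean distance between the means, which is one way to see how a KL-based formulation can generalize a distance-based one.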

[1]  Xudong Jiang, et al. Linear Subspace Learning-Based Dimensionality Reduction, 2011, IEEE Signal Processing Magazine.

[2]  Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[3]  Han Wang, et al. MAP approximation to the variational Bayes Gaussian mixture model and application, 2018, Soft Comput.

[4]  Naixue Xiong, et al. Learning Sparse Representation With Variational Auto-Encoder for Anomaly Detection, 2018, IEEE Access.

[5]  Vishal M. Patel, et al. Deep Sparse Representation-Based Classification, 2019, IEEE Signal Processing Letters.

[6]  Ali Farhadi, et al. Unsupervised Deep Embedding for Clustering Analysis, 2015, ICML.

[7]  Bo Zhang, et al. Discriminatively Boosted Image Clustering with Fully Convolutional Auto-Encoders, 2017, Pattern Recognit.

[8]  Diederik P. Kingma, et al. Stochastic Gradient VB and the Variational Auto-Encoder, 2013.

[9]  Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.

[10]  Jiawei Han, et al. Document clustering using locality preserving indexing, 2005, IEEE Transactions on Knowledge and Data Engineering.

[11]  Shuigeng Zhou, et al. DeepCluster: A General Clustering Framework Based on Deep Learning, 2017, ECML/PKDD.

[12]  Tong Zhang, et al. Deep Subspace Clustering Networks, 2017, NIPS.

[13]  Ali Taylan Cemgil, et al. Audio Source Separation Using Variational Autoencoders and Weak Class Supervision, 2019, IEEE Signal Processing Letters.

[14]  Wei-Yun Yau, et al. Structured AutoEncoders for Subspace Clustering, 2018, IEEE Transactions on Image Processing.

[15]  Jacek M. Zurada, et al. Deep Learning of Constrained Autoencoders for Enhanced Understanding of Data, 2018, IEEE Transactions on Neural Networks and Learning Systems.

[16]  Hirokazu Kameoka, et al. Joint Separation and Dereverberation of Reverberant Mixtures with Multichannel Variational Autoencoder, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[17]  Shiming Xiang, et al. Self-Paced AutoEncoder, 2018, IEEE Signal Processing Letters.

[18]  Bo Yang, et al. Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering, 2016, ICML.

[19]  Rodrigo G. F. Soares. Effort Estimation via Text Classification And Autoencoders, 2018, 2018 International Joint Conference on Neural Networks (IJCNN).

[20]  Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.

[21]  Chao Wang, et al. Improving Emotion Classification through Variational Inference of Latent Variables, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[22]  Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Antonio Torralba, et al. Recognizing indoor scenes, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[24]  Huachun Tan, et al. Variational Deep Embedding: An Unsupervised and Generative Approach to Clustering, 2016, IJCAI.

[25]  Ming-Hsuan Yang, et al. Subspace Clustering via Good Neighbors, 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[26]  Radu Horaud, et al. Speech Enhancement with Variational Autoencoders and Alpha-stable Distributions, 2019, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[27]  Radford M. Neal. Pattern Recognition and Machine Learning, 2007, Technometrics.

[28]  Andrew Y. Ng, et al. Reading Digits in Natural Images with Unsupervised Feature Learning, 2011.

[29]  Feng Liu, et al. Auto-encoder Based Data Clustering, 2013, CIARP.

[30]  Xudong Jiang, et al. Asymmetric Principal Component and Discriminant Analyses for Pattern Classification, 2009, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  Björn W. Schuller, et al. Universum Autoencoder-Based Domain Adaptation for Speech Emotion Recognition, 2017, IEEE Signal Processing Letters.

[32]  Yangyu Fan, et al. Automatic Modulation Classification Using Deep Learning Based on Sparse Autoencoders With Nonnegativity Constraints, 2017, IEEE Signal Processing Letters.

[33]  Cordelia Schmid, et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[34]  Kou Tanaka, et al. ACVAE-VC: Non-Parallel Voice Conversion With Auxiliary Classifier Variational Autoencoder, 2019, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[35]  Ying Tan, et al. Semisupervised Text Classification by Variational Autoencoder, 2020, IEEE Transactions on Neural Networks and Learning Systems.

[36]  John H. L. Hansen, et al. Language/Dialect Recognition Based on Unsupervised Deep Learning, 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[37]  Junbin Gao, et al. Nonlinear Subspace Clustering via Adaptive Graph Regularized Autoencoder, 2019, IEEE Access.

[38]  Hui Feng, et al. Efficient Compressed Sensing for Wireless Neural Recording: A Deep Learning Approach, 2017, IEEE Signal Processing Letters.

[39]  Shahrokh Ghaemmaghami, et al. A New Framework to Train Autoencoders Through Non-Smooth Regularization, 2019, IEEE Transactions on Signal Processing.