Latent Tree Variational Autoencoder for Joint Representation Learning and Multidimensional Clustering

Recently, deep learning based clustering methods have been shown to outperform traditional ones by jointly conducting representation learning and clustering. These methods rely on two assumptions: that the number of clusters is known, and that there is a single partition of the data defined by all attributes. However, in real-world applications, prior knowledge of the number of clusters is usually unavailable, and there are often multiple ways to partition the data based on different subsets of attributes. To address these issues, we propose the latent tree variational autoencoder (LTVAE), which simultaneously performs representation learning and multidimensional clustering. LTVAE learns latent embeddings from data, discovers multi-facet clustering structures based on subsets of latent features, and automatically determines the number of clusters in each facet. Experiments show that the proposed method achieves state-of-the-art clustering performance and reveals reasonable multi-facet structures of the data.
