Deep Amortized Clustering

We propose deep amortized clustering (DAC), a neural architecture that learns to cluster datasets efficiently in a few forward passes. DAC implicitly learns what makes a cluster, how to group data points into clusters, and how to count the number of clusters in a dataset. DAC is meta-learned from labelled datasets, unlike traditional clustering algorithms, which usually require hand-specified prior knowledge about cluster shapes and structures. We show empirically, on both synthetic and image data, that DAC can efficiently and accurately cluster new datasets drawn from the same distribution that generated the training datasets.
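To make the meta-learning setup concrete, the following is a minimal sketch of how labelled training datasets for such a model might be generated. It is not the authors' pipeline: the Gaussian-mixture generator, the parameter ranges, and the function names `sample_labelled_dataset` and `make_meta_training_set` are all assumptions chosen for illustration. The point it shows is that each training example is an entire dataset paired with its ground-truth cluster labels, and that the number of clusters varies from one dataset to the next.

```python
import random


def sample_labelled_dataset(n_points=100, max_clusters=4, dim=2, rng=None):
    """Sample one synthetic task: a list of points and their cluster labels.

    Assumption for illustration: clusters are isotropic Gaussians with
    centers drawn uniformly from [-5, 5] and a fixed standard deviation.
    """
    rng = rng or random.Random()
    k = rng.randint(1, max_clusters)  # number of mixture components for this dataset
    centers = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(k)]
    points, labels = [], []
    for _ in range(n_points):
        c = rng.randrange(k)  # pick a component uniformly at random
        points.append([rng.gauss(mu, 0.5) for mu in centers[c]])
        labels.append(c)
    return points, labels


def make_meta_training_set(n_datasets=8, seed=0):
    """Build a collection of labelled datasets drawn from one generating distribution."""
    rng = random.Random(seed)
    return [sample_labelled_dataset(rng=rng) for _ in range(n_datasets)]


if __name__ == "__main__":
    for points, labels in make_meta_training_set():
        # The number of non-empty clusters is what a model would learn to count.
        print(len(points), "points,", len(set(labels)), "clusters")
```

A model trained on many such datasets would then be evaluated on fresh datasets drawn from the same generator, which matches the evaluation setting described in the abstract.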
