Multilevel Clustering via Wasserstein Means

We propose a novel approach to multilevel clustering, which aims to simultaneously partition the data within each group and discover grouping patterns among the groups in a potentially large, hierarchically structured corpus of data. Our method is formulated as a joint optimization over several spaces of discrete probability measures, each endowed with a Wasserstein distance metric. By exploiting the connection to the problem of finding Wasserstein barycenters, we derive several variants of this problem that admit fast optimization algorithms. Consistency properties are established for the estimates of both local and global clusters. Finally, experimental results on both synthetic and real data demonstrate the flexibility and scalability of the proposed approach.
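To make the central object concrete, the sketch below computes the squared 2-Wasserstein distance between two discrete probability measures. It is an illustration under simplifying assumptions, not the algorithm of the paper: the atoms are assumed to lie on the real line, where the optimal coupling is the monotone (sorted) one, and the names `wasserstein2_sq_1d` and `kmeans_cost_1d` are hypothetical helpers introduced here. The second helper illustrates the well-known link between k-means and Wasserstein distances: when each centroid is weighted by the fraction of points nearest to it, the k-means cost coincides with the squared distance between the empirical measure and the centroid measure.

```python
import numpy as np


def wasserstein2_sq_1d(x, a, y, b, tol=1e-12):
    """Squared W2 between sum_i a_i*delta(x_i) and sum_j b_j*delta(y_j) on the real line.

    Assumes a and b are nonnegative and each sums to 1. In one dimension the
    optimal coupling matches mass in sorted order, so a single sweep suffices.
    """
    x, a, y, b = (np.asarray(v, dtype=float) for v in (x, a, y, b))
    ix, iy = np.argsort(x), np.argsort(y)
    x, a = x[ix], a[ix].copy()
    y, b = y[iy], b[iy].copy()

    i = j = 0
    cost = 0.0
    while i < len(x) and j < len(y):
        mass = min(a[i], b[j])            # mass moved from x[i] to y[j]
        cost += mass * (x[i] - y[j]) ** 2
        a[i] -= mass
        b[j] -= mass
        if a[i] <= tol:                   # atom x[i] is exhausted
            i += 1
        if b[j] <= tol:                   # atom y[j] is filled
            j += 1
    return cost


def kmeans_cost_1d(data, centroids):
    """Mean squared distance from each point to its nearest centroid."""
    data = np.asarray(data, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    d2 = (data[:, None] - centroids[None, :]) ** 2
    return d2.min(axis=1).mean()


# Two measures with two atoms each.
print(wasserstein2_sq_1d([0.0, 1.0], [0.5, 0.5], [0.0, 2.0], [0.5, 0.5]))  # 0.5

# k-means cost versus W2^2 between the empirical measure and the centroid measure.
data = [0.0, 1.0, 9.0, 10.0]
centroids = [0.5, 9.5]
uniform = [0.25] * 4                      # empirical measure of the data
cell_mass = [0.5, 0.5]                    # fraction of points nearest to each centroid
print(kmeans_cost_1d(data, centroids))
print(wasserstein2_sq_1d(data, uniform, centroids, cell_mass))
```

In higher dimensions the sorted coupling no longer applies, and the distance would instead be obtained from a linear program or an entropic approximation such as Sinkhorn iterations.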
