A Swiss Army Knife for Minimax Optimal Transport

The optimal transport (OT) problem and its associated Wasserstein distance have recently attracted great interest in the machine learning community. However, the underlying optimization problem has two major limitations: (i) it depends strongly on the choice of the cost function, and (ii) its sample complexity scales exponentially with the dimension. In this paper, we propose a general formulation of a minimax OT problem that tackles these limitations by jointly optimizing the cost matrix and the transport plan, which allows us to define a robust distance between distributions. We propose a cutting-set method to solve this general problem and show its links to, and advantages over, existing minimax OT approaches. Additionally, we use this method to define a notion of stability that allows us to select a ground metric robust to bounded perturbations. Finally, we provide an experimental study highlighting the efficiency of our approach.
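
To make the idea of jointly optimizing the cost matrix and the transport plan concrete, the following is a minimal sketch of a cutting-set loop, not the paper's implementation. It assumes, purely for illustration, that the admissible cost matrices are convex combinations of a finite list of candidate matrices, and it solves both the inner OT problem and the master problem as linear programs with SciPy. The names `solve_ot` and `minimax_ot` and the toy data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def solve_ot(cost, a, b):
    """Exact OT between histograms a and b for a fixed cost matrix (an LP)."""
    n, m = cost.shape
    # Marginal constraints: rows of the plan sum to a, columns sum to b.
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):
        A_eq[n + j, j::m] = 1.0
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=np.concatenate([a, b]),
                  bounds=(0, None), method="highs")
    return res.x.reshape(n, m), res.fun


def minimax_ot(costs, a, b, tol=1e-7, max_iter=100):
    """Cutting-set sketch for max over lam in the simplex of
    min over plans P of <P, sum_k lam_k * costs[k]>.

    Assumption (not from the source): the cost set is the convex hull
    of the finite list `costs`.
    """
    costs = np.asarray(costs, dtype=float)
    K = len(costs)
    lam = np.full(K, 1.0 / K)   # start from the uniform mixture
    plans, upper = [], np.inf
    for _ in range(max_iter):
        # Oracle step: best plan for the current cost gives a lower bound.
        P, lower = solve_ot(np.tensordot(lam, costs, axes=1), a, b)
        if upper - lower <= tol:
            break
        plans.append(P)
        # Master step: maximize t subject to t <= <P_j, C(lam)> for every
        # stored plan P_j, with lam in the simplex. Variables: (lam, t).
        G = np.array([[np.sum(Pj * Ck) for Ck in costs] for Pj in plans])
        c = np.zeros(K + 1)
        c[-1] = -1.0  # linprog minimizes, so minimize -t
        A_ub = np.hstack([-G, np.ones((len(plans), 1))])
        res = linprog(c, A_ub=A_ub, b_ub=np.zeros(len(plans)),
                      A_eq=np.append(np.ones(K), 0.0)[None, :], b_eq=[1.0],
                      bounds=[(0, None)] * K + [(None, None)],
                      method="highs")
        lam, upper = res.x[:K], res.x[-1]  # restricted problem: upper bound
    return lam, P, lower


# Toy usage: two candidate ground costs on two points.
a = b = np.array([0.5, 0.5])
C1 = np.array([[0.0, 1.0], [1.0, 0.0]])
C2 = np.array([[1.0, 0.0], [0.0, 1.0]])
lam, plan, value = minimax_ot([C1, C2], a, b)
print(lam, value)
```

Each iteration adds one transport plan as a cut, so the master problem stays a small LP in the mixture weights. The gap between the master value (an upper bound) and the oracle value (a lower bound) serves as the stopping criterion.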
