Paper Information - Partial Wasserstein Covering

Partial Wasserstein Covering

We consider a general task called partial Wasserstein covering, whose goal is to make a small dataset (e.g., a development dataset) emulate a large dataset (e.g., an application dataset) in terms of the empirical distribution. This is done by selecting a small subset from a candidate dataset and adding it to the small dataset. We model this task as a discrete optimization problem with the partial Wasserstein divergence as the objective function. Although this problem is NP-hard, we prove that it has the submodular property, which allows us to use a greedy algorithm with a 0.63 approximation guarantee. However, the greedy algorithm is still inefficient because every evaluation of the objective function requires solving a linear program. To overcome this difficulty, we propose quasi-greedy algorithms for acceleration, which combine techniques such as sensitivity analysis based on strong duality and the so-called C-transform from the optimal transport field. Experimentally, we demonstrate that we can efficiently make two datasets, including driving scene datasets, similar in terms of partial Wasserstein divergence.
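The plain greedy baseline described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `partial_wasserstein` and `greedy_cover` are hypothetical, the cost is assumed to be squared Euclidean distance, and the mass convention (each application point carries mass 1/m, and each point of the augmented small dataset may absorb at most 1/|development set|) is an assumption made here so that adding points can only lower the transport cost. The quasi-greedy accelerations (sensitivity analysis, C-transform) are not shown.

```python
import numpy as np
from scipy.optimize import linprog


def partial_wasserstein(app, small, cap):
    """Transport all mass of `app` onto `small` at minimum cost.

    Assumed convention (not taken from the paper): every point in `app`
    ships mass 1/m, and every point in `small` absorbs at most `cap`.
    """
    m, k = len(app), len(small)
    # Squared Euclidean cost matrix of shape (m, k).
    cost = ((app[:, None, :] - small[None, :, :]) ** 2).sum(axis=2)
    # The plan P is flattened row-major: variable index = i * k + j.
    # Equality rows: each application point ships exactly 1/m.
    A_eq = np.kron(np.eye(m), np.ones((1, k)))
    b_eq = np.full(m, 1.0 / m)
    # Inequality rows: each target point receives at most `cap`.
    A_ub = np.kron(np.ones((1, m)), np.eye(k))
    b_ub = np.full(k, cap)
    res = linprog(cost.ravel(), A_ub=A_ub, b_ub=b_ub,
                  A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun


def greedy_cover(app, dev, cand, budget):
    """Greedily add up to `budget` candidate points to `dev`.

    Each step solves one linear program per remaining candidate and keeps
    the candidate that lowers the divergence the most.
    """
    cap = 1.0 / len(dev)  # assumed per-point capacity
    chosen = []
    current = partial_wasserstein(app, dev, cap)
    for _ in range(budget):
        best_idx, best_val = None, current
        for idx in range(len(cand)):
            if idx in chosen:
                continue
            trial = np.vstack([dev, cand[chosen + [idx]]])
            val = partial_wasserstein(app, trial, cap)
            if val < best_val - 1e-12:
                best_idx, best_val = idx, val
        if best_idx is None:  # no candidate improves the objective
            break
        chosen.append(best_idx)
        current = best_val
    return chosen, current
```

Each greedy step here solves one linear program per remaining candidate, so the cost grows quickly with the candidate set size. That is the inefficiency the abstract points to as motivation for the quasi-greedy algorithms.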
