Paper Information - Nonparametric Generative Modeling with Conditional Sliced-Wasserstein Flows

Sliced-Wasserstein Flow (SWF) is a promising approach to nonparametric generative modeling, but it has not been widely adopted because of its suboptimal generative quality and lack of conditional modeling capabilities. In this work, we make two major contributions toward bridging this gap. First, based on a pleasant observation that (under certain conditions) the SWF of joint distributions coincides with that of conditional distributions, we propose Conditional Sliced-Wasserstein Flow (CSWF), a simple yet effective extension of SWF that enables nonparametric conditional modeling. Second, we introduce appropriate inductive biases of images into SWF with two techniques inspired by local connectivity and multiscale representation in vision research, which greatly improve the efficiency and quality of modeling images. With all the improvements, we achieve generative performance comparable with many deep parametric generative models on both conditional and unconditional tasks in a purely nonparametric fashion, demonstrating the great potential of the approach.

Min Lin | Chao Du | Tianyu Pang | Tianbo Li | Shuicheng Yan
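
The abstract refers to Sliced-Wasserstein Flow without showing what a nonparametric particle update looks like. The following is a minimal sketch of the general idea only, not the authors' algorithm: particles are projected onto random directions, each one-dimensional projection is matched to the data by sorting (quantile matching), and the particles are moved by the averaged displacement. The function name `swf_step`, its parameters, and the toy Gaussian data are all assumptions made for illustration. The sketch leaves out any regularization or noise term, the conditional extension, and the image-specific techniques mentioned in the abstract.

```python
import numpy as np


def swf_step(particles, data, n_projections=64, step_size=1.0, rng=None):
    """Move `particles` toward `data` using random one-dimensional slices.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = particles.shape

    # Draw random unit directions (uniform on the sphere).
    thetas = rng.normal(size=(n_projections, d))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)

    # Project both point sets onto every direction.
    proj_particles = thetas @ particles.T  # shape (n_projections, n)
    proj_data = thetas @ data.T            # shape (n_projections, m)

    # Mid-point quantile levels, one per particle.
    levels = (np.arange(n) + 0.5) / n

    update = np.zeros_like(particles)
    for theta, pp, pd in zip(thetas, proj_particles, proj_data):
        # In 1D the optimal transport map is monotone, so the k-th smallest
        # particle projection is sent to the matching data quantile.
        ranks = np.argsort(np.argsort(pp))
        targets = np.quantile(pd, levels)[ranks]
        # Move each particle along theta by its 1D displacement.
        update += np.outer(targets - pp, theta)

    # Average the displacements over all directions.
    return particles + step_size * update / n_projections


# Toy usage: push standard-normal particles toward data centred at (3, 3).
rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, size=(500, 2))
particles = rng.normal(size=(200, 2))
for _ in range(100):
    particles = swf_step(particles, data, rng=rng)
```

No density model or neural network is trained here; the data samples themselves drive every update, which is what makes this kind of procedure nonparametric.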
