Structured Output Learning with Conditional Generative Flows

Traditional structured prediction models try to learn the conditional likelihood, i.e., p(y|x), to capture the relationship between the structured output y and the input features x. For many models, computing the likelihood is intractable. These models are therefore hard to train, requiring the use of surrogate objectives or variational inference to approximate likelihood. In this paper, we propose conditional Glow (c-Glow), a conditional generative flow for structured output learning. C-Glow benefits from the ability of flow-based models to compute p(y|x) exactly and efficiently. Learning with c-Glow does not require a surrogate objective or performing inference during training. Once trained, we can directly and efficiently generate conditional samples. We develop a sample-based prediction method, which can use this advantage to do efficient and effective inference. In our experiments, we test c-Glow on five different tasks. C-Glow outperforms the state-of-the-art baselines in some tasks and predicts comparable outputs in the other tasks. The results show that c-Glow is versatile and is applicable to many different structured prediction problems.

[1]  Stefan Roth,et al.  Shrinkage Fields for Effective Image Restoration , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Shakir Mohamed,et al.  Variational Inference with Normalizing Flows , 2015, ICML.

[3]  David Duvenaud,et al.  FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models , 2018, ICLR.

[4]  Michael I. Jordan,et al.  Graphical Models, Exponential Families, and Variational Inference , 2008, Found. Trends Mach. Learn..

[5]  Antonio Criminisi,et al.  Filter Forests for Learning Data-Dependent Convolutional Kernels , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Iain Murray,et al.  Masked Autoregressive Flow for Density Estimation , 2017, NIPS.

[7]  Andrew McCallum,et al.  End-to-End Learning for Structured Prediction Energy Networks , 2017, ICML.

[8]  Richard E. Turner,et al.  Conditional Density Estimation with Bayesian Normalising Flows , 2018, 1802.04908.

[9]  Alessandro Foi,et al.  Image Denoising by Sparse 3-D Transform-Domain Collaborative Filtering , 2007, IEEE Transactions on Image Processing.

[10]  Andrew McCallum,et al.  Structured Prediction Energy Networks , 2015, ICML.

[11]  Honglak Lee,et al.  Learning Structured Output Representation using Deep Conditional Generative Models , 2015, NIPS.

[12]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[13]  Ofer Meshi,et al.  Deep Structured Prediction with Nonlinear Output Transformations , 2018, NeurIPS.

[14]  David Zhang,et al.  Multi-channel Weighted Nuclear Norm Minimization for Real Color Image Denoising , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[15]  Erik G. Learned-Miller,et al.  Unsupervised Joint Alignment of Complex Images , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[16]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Alexander M. Rush,et al.  Latent Normalizing Flows for Discrete Sequences , 2019, ICML.

[18]  Alexandre Lacoste,et al.  Neural Autoregressive Flows , 2018, ICML.

[19]  Max Welling,et al.  Emerging Convolutions for Generative Normalizing Flows , 2019, ICML.

[20]  Alan L. Yuille,et al.  Learning Deep Structured Models , 2014, ICML.

[21]  David Duvenaud,et al.  Neural Ordinary Differential Equations , 2018, NeurIPS.

[22]  Vibhav Vineet,et al.  Conditional Random Fields as Recurrent Neural Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[23]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[24]  Minh N. Do,et al.  Semantic Image Inpainting with Deep Generative Models , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Honglak Lee,et al.  Augmenting CRFs with Boltzmann Machine Shape Priors for Image Labeling , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Matthias Bethge,et al.  A note on the evaluation of generative models , 2015, ICLR.

[27]  Vladlen Koltun,et al.  Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials , 2011, NIPS.

[28]  Lifu Tu,et al.  Learning Approximate Inference Networks for Structured Prediction , 2018, ICLR.

[29]  Michael J. Black,et al.  Fields of Experts , 2009, International Journal of Computer Vision.

[30]  S. Tsogkas,et al.  Deep Learning for Semantic Part Segmentation with High-Level Guidance , 2015 .

[31]  Eduard H. Hovy,et al.  MaCow: Masked Convolutional Generative Flow , 2019, NeurIPS.

[32]  Xiaogang Wang,et al.  Deep Learning Face Attributes in the Wild , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[33]  Thomas S. Huang,et al.  Generative Image Inpainting with Contextual Attention , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[34]  Andrew W. Fitzgibbon,et al.  KinectFusion: Real-time dense surface mapping and tracking , 2011, 2011 10th IEEE International Symposium on Mixed and Augmented Reality.

[35]  I JordanMichael,et al.  Graphical Models, Exponential Families, and Variational Inference , 2008 .

[36]  Mohammad Norouzi,et al.  Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs , 2017, ICML.

[37]  Charless C. Fowlkes,et al.  Contour Detection and Hierarchical Image Segmentation , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[38]  Sebastian Nowozin,et al.  Structured Learning and Prediction in Computer Vision , 2011, Found. Trends Comput. Graph. Vis..

[39]  Max Welling,et al.  Improved Variational Inference with Inverse Autoregressive Flow , 2016, NIPS 2016.

[40]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[41]  Pieter Abbeel,et al.  Flow++: Improving Flow-Based Generative Models with Variational Dequantization and Architecture Design , 2019, ICML.

[42]  Prafulla Dhariwal,et al.  Glow: Generative Flow with Invertible 1x1 Convolutions , 2018, NeurIPS.

[43]  Samy Bengio,et al.  Density estimation using Real NVP , 2016, ICLR.

[44]  Shimon Ullman,et al.  Class-Specific, Top-Down Segmentation , 2002, ECCV.

[45]  Lei Zhang,et al.  Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising , 2016, IEEE Transactions on Image Processing.

[46]  Sanja Fidler,et al.  Proximal Deep Structured Models , 2016, NIPS.

[47]  Nir Friedman,et al.  Probabilistic Graphical Models - Principles and Techniques , 2009 .

[48]  Dale Schuurmans,et al.  Reward Augmented Maximum Likelihood for Neural Structured Prediction , 2016, NIPS.

[49]  Yoshua Bengio,et al.  NICE: Non-linear Independent Components Estimation , 2014, ICLR.