PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications

PixelCNNs are a recently proposed class of powerful generative models with tractable likelihood. Here we discuss our implementation of PixelCNNs, which we make available at this https URL. Our implementation contains a number of modifications to the original model that both simplify its structure and improve its performance. 1) We use a discretized logistic mixture likelihood on the pixels, rather than a 256-way softmax, which we find to speed up training. 2) We condition on whole pixels, rather than R/G/B sub-pixels, simplifying the model structure. 3) We use downsampling to efficiently capture structure at multiple resolutions. 4) We introduce additional short-cut connections to further speed up optimization. 5) We regularize the model using dropout. Finally, we present state-of-the-art log-likelihood results on CIFAR-10 to demonstrate the usefulness of these modifications.
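
To make point 1 concrete: a discretized logistic mixture gives each integer pixel value the probability mass that a mixture of logistic distributions places on the unit-wide interval around that value. The sketch below is an illustration only, not the authors' released code. The function name, the NumPy formulation, the 0 to 255 value scale, and the choice to let the two edge bins absorb the tail mass are all assumptions made for this example.

```python
import numpy as np


def logistic_cdf(z):
    # Numerically stable logistic sigmoid, written via tanh.
    return 0.5 * (1.0 + np.tanh(0.5 * z))


def discretized_logistic_mixture_pmf(weights, means, scales, num_bins=256):
    """Probability of each integer value 0..num_bins-1 under the mixture.

    weights: mixture weights (assumed non-negative and summing to 1)
    means, scales: per-component location and scale, on the 0..255 scale
    (hypothetical parameterization chosen for this sketch)
    """
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    scales = np.asarray(scales, dtype=float)

    # Column of bin centers, broadcast against the mixture components.
    x = np.arange(num_bins, dtype=float)[:, None]

    # CDF at the upper and lower edge of every bin, per component.
    cdf_upper = logistic_cdf((x + 0.5 - means) / scales)
    cdf_lower = logistic_cdf((x - 0.5 - means) / scales)

    # Assumption: the first and last bins absorb the left and right tails,
    # so each component's bin masses sum to exactly 1.
    cdf_lower[0, :] = 0.0
    cdf_upper[-1, :] = 1.0

    # Weighted sum of per-component bin masses.
    return (cdf_upper - cdf_lower) @ weights


pmf = discretized_logistic_mixture_pmf(
    weights=[0.3, 0.7], means=[60.0, 180.0], scales=[8.0, 15.0]
)
log_likelihood_of_value_100 = np.log(pmf[100])
```

Compared with a 256-way softmax, this parameterization needs only a weight, a mean, and a scale per mixture component, and it assigns similar probabilities to neighboring intensity values by construction.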

Xi Chen | Tim Salimans | Andrej Karpathy | Diederik P. Kingma
