DRAW: A Recurrent Neural Network For Image Generation

This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation. DRAW networks combine a novel spatial attention mechanism that mimics the foveation of the human eye with a sequential variational auto-encoding framework that allows for the iterative construction of complex images. The system substantially improves on the state of the art for generative models on MNIST, and, when trained on the Street View House Numbers dataset, it generates images that cannot be distinguished from real data with the naked eye.
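
The "iterative construction" idea can be illustrated with a toy sketch: a canvas starts empty, receives a sequence of additive updates, and is squashed into pixel intensities only at the end. This is a simplified illustration under stated assumptions, not the paper's implementation. The function names (`toy_write`, `draw_iteratively`) are hypothetical, and random noise stands in for the learned recurrent decoder, latent samples, and attention window, none of which are modelled here.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def toy_write(height, width, rng):
    """Hypothetical stand-in for a learned decoder's write operation.

    Returns one additive update per pixel. A real model would compute
    this from a recurrent state and a sampled latent; here it is noise.
    """
    return [[rng.gauss(0.0, 0.5) for _ in range(width)] for _ in range(height)]


def draw_iteratively(num_steps=8, height=4, width=4, seed=0):
    """Build an image by accumulating updates on a canvas over several steps."""
    rng = random.Random(seed)
    canvas = [[0.0] * width for _ in range(height)]  # empty starting canvas
    history = []
    for _ in range(num_steps):
        update = toy_write(height, width, rng)
        # Each step refines the canvas additively rather than redrawing it.
        canvas = [
            [c + u for c, u in zip(canvas_row, update_row)]
            for canvas_row, update_row in zip(canvas, update)
        ]
        history.append(canvas)
    # Pixel intensities are read off the final canvas only.
    image = [[sigmoid(v) for v in row] for row in canvas]
    return image, history


if __name__ == "__main__":
    image, history = draw_iteratively()
    print(len(history), "steps;", len(image), "x", len(image[0]), "image")
```

The design point the sketch tries to capture is that no single step has to produce the whole image: each update only adjusts what is already on the canvas, so the intermediate canvases in `history` show the image forming gradually.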
