Prediction Under Uncertainty with Error-Encoding Networks

In this work we introduce a new framework for performing temporal predictions in the presence of uncertainty. It is based on a simple idea of disentangling com- ponents of the future state which are predictable from those which are inherently unpredictable, and encoding the unpredictable components into a low-dimensional latent variable which is fed into the forward model. Our method uses a simple su- pervised training objective which is fast and easy to train. We evaluate it in the context of video prediction on multiple datasets and show that it is able to consi- tently generate diverse predictions without the need for alternating minimization over a latent space or adversarial training.

[1]  Jitendra Malik,et al.  Learning to Poke by Poking: Experiential Learning of Intuitive Physics , 2016, NIPS.

[2]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[3]  Antonio Torralba,et al.  Anticipating the future by watching unlabeled video , 2015, ArXiv.

[4]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[5]  Kyunghyun Cho,et al.  Query-Efficient Imitation Learning for End-to-End Autonomous Driving , 2016, ArXiv.

[6]  Yann LeCun,et al.  Learning to Linearize Under Uncertainty , 2015, NIPS.

[7]  David Lopez-Paz,et al.  Optimizing the Latent Space of Generative Networks , 2017, ICML.

[8]  Vighnesh Birodkar,et al.  Unsupervised Learning of Disentangled Representations from Video , 2017, NIPS.

[9]  Yann LeCun,et al.  Learning Fast Approximations of Sparse Coding , 2010, ICML.

[10]  Nitish Srivastava,et al.  Unsupervised Learning of Video Representations using LSTMs , 2015, ICML.

[11]  Wojciech Zaremba,et al.  Improved Techniques for Training GANs , 2016, NIPS.

[12]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[13]  Sergey Levine,et al.  Unsupervised Learning for Physical Interaction through Video Prediction , 2016, NIPS.

[14]  Bernhard Schölkopf,et al.  AdaGAN: Boosting Generative Models , 2017, NIPS.

[15]  Rajesh P. N. Rao,et al.  Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. , 1999 .

[16]  Gabriel Kreiman,et al.  Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning , 2016, ICLR.

[17]  Alexei A. Efros,et al.  Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Jürgen Schmidhuber,et al.  Learning Complex, Extended Sequences Using the Principle of History Compression , 1992, Neural Computation.

[19]  José Carlos Príncipe,et al.  Deep Predictive Coding Networks , 2013, ICLR.

[20]  Marc G. Bellemare,et al.  The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..

[21]  Alex Graves,et al.  Video Pixel Networks , 2016, ICML.

[22]  Soumith Chintala,et al.  Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[23]  Alex Graves,et al.  Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.

[24]  Michael W. Spratling Predictive coding as a model of biased competition in visual attention , 2008, Vision Research.

[25]  David Pfau,et al.  Unrolled Generative Adversarial Networks , 2016, ICLR.

[26]  Yann LeCun,et al.  Deep multi-scale video prediction beyond mean square error , 2015, ICLR.

[27]  Seunghoon Hong,et al.  Decomposing Motion and Content for Natural Video Sequence Prediction , 2017, ICLR.