Predicting Video with VQVAE

[1]  Oriol Vinyals,et al.  Neural Discrete Representation Learning , 2017, NIPS.

[2]  Karen Simonyan,et al.  The challenge of realistic music generation: modelling raw audio at scale , 2018, NeurIPS.

[3]  Sergey Levine,et al.  Unsupervised Learning for Physical Interaction through Video Prediction , 2016, NIPS.

[4]  Pieter Abbeel,et al.  PixelSNAIL: An Improved Autoregressive Generative Model , 2017, ICML.

[5]  Diego de Las Casas,et al.  Transformation-based Adversarial Video Prediction on Large-Scale Data , 2020, ArXiv.

[6]  Serge J. Belongie,et al.  Controllable Video Generation with Sparse Trajectories , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[7]  Yale Song,et al.  Video Prediction with Appearance and Motion Conditions , 2018, ICML.

[8]  Jing Dong,et al.  On the generalization of GAN image forensics , 2019, CCBR.

[9]  Nal Kalchbrenner,et al.  Generating High Fidelity Images with Subscale Pixel Networks and Multidimensional Upscaling , 2018, ICLR.

[10]  Wei Xiong,et al.  Learning to Generate Time-Lapse Videos Using Multi-stage Dynamic Generative Adversarial Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[11]  Christopher K. I. Williams,et al.  The shape variational autoencoder: A deep generative model of part‐segmented 3D objects , 2017, Comput. Graph. Forum.

[12]  Matthias Bethge,et al.  A note on the evaluation of generative models , 2015, ICLR.

[13]  Yann LeCun,et al.  Predicting Deeper into the Future of Semantic Segmentation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[14]  Trevor Darrell,et al.  Disentangling Propagation and Generation for Video Prediction , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[15]  Jeff Donahue,et al.  Large Scale GAN Training for High Fidelity Natural Image Synthesis , 2018, ICLR.

[16]  Shunta Saito,et al.  TGANv2: Efficient Training of Large Models for Video Generation with Multiple Subsampling Layers , 2018, ArXiv.

[17]  Ilya Sutskever,et al.  Language Models are Unsupervised Multitask Learners , 2019 .

[18]  Martial Hebert,et al.  An Uncertain Future: Forecasting from Static Images Using Variational Autoencoders , 2016, ECCV.

[19]  Ruben Villegas,et al.  Learning to Generate Long-term Future via Hierarchical Prediction , 2017, ICML.

[20]  Jakob Uszkoreit,et al.  Scaling Autoregressive Video Models , 2019, ICLR.

[21]  B. Caputo,et al.  Recognizing human actions: a local SVM approach , 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004..

[22]  Viorica Patraucean,et al.  Spatio-temporal video autoencoder with differentiable memory , 2015, ArXiv.

[23]  Sergey Levine,et al.  VideoFlow: A Flow-Based Generative Model for Video , 2019, ArXiv.

[24]  Antonio Torralba,et al.  Generating Videos with Scene Dynamics , 2016, NIPS.

[25]  Razvan Pascanu,et al.  Imagination-Augmented Agents for Deep Reinforcement Learning , 2017, NIPS.

[26]  Nitish Srivastava,et al.  Unsupervised Learning of Video Representations using LSTMs , 2015, ICML.

[27]  Jiajun Wu,et al.  Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks , 2016, NIPS.

[28]  Junichi Yamagishi,et al.  MesoNet: a Compact Facial Video Forgery Detection Network , 2018, 2018 IEEE International Workshop on Information Forensics and Security (WIFS).

[29]  C. Lee Giles,et al.  Learning a Hierarchical Latent-Variable Model of 3D Shapes , 2017, 2018 International Conference on 3D Vision (3DV).

[30]  Ming-Hsuan Yang,et al.  Flow-Grounded Spatial-Temporal Video Prediction from Still Images , 2018, ECCV.

[31]  Ali Razavi,et al.  Generating Diverse High-Fidelity Images with VQ-VAE-2 , 2019, NeurIPS.

[32]  Sergio Gomez Colmenarejo,et al.  Parallel Multiscale Autoregressive Density Estimation , 2017, ICML.

[33]  Honglak Lee,et al.  Action-Conditional Video Prediction using Deep Networks in Atari Games , 2015, NIPS.

[34]  Jaakko Lehtinen,et al.  Progressive Growing of GANs for Improved Quality, Stability, and Variation , 2017, ICLR.

[35]  Sergio Escalera,et al.  Folded Recurrent Neural Networks for Future Video Prediction , 2017, ECCV.

[36]  Martial Hebert,et al.  The Pose Knows: Video Forecasting by Generating Pose Futures , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[37]  Ilya Sutskever,et al.  Jukebox: A Generative Model for Music , 2020, ArXiv.

[38]  Premkumar Natarajan,et al.  Recurrent Convolutional Strategies for Face Manipulation Detection in Videos , 2019, CVPR Workshops.

[39]  Sergey Levine,et al.  Self-Supervised Visual Planning with Temporal Skip Connections , 2017, CoRL.

[40]  Heiga Zen,et al.  WaveNet: A Generative Model for Raw Audio , 2016, SSW.

[41]  Jeff Donahue,et al.  Adversarial Video Generation on Complex Datasets , 2019 .

[42]  Marc'Aurelio Ranzato,et al.  Video (language) modeling: a baseline for generative models of natural videos , 2014, ArXiv.

[43]  Yoshua Bengio,et al.  Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation , 2013, ArXiv.

[44]  Yann LeCun,et al.  Predicting Future Instance Segmentations by Forecasting Convolutional Features , 2018, ECCV.

[45]  Jan Kautz,et al.  MoCoGAN: Decomposing Motion and Content for Video Generation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[46]  Yann LeCun,et al.  Deep multi-scale video prediction beyond mean square error , 2015, ICLR.

[47]  Koray Kavukcuoglu,et al.  Pixel Recurrent Neural Networks , 2016, ICML.

[48]  Karen Simonyan,et al.  Hierarchical Autoregressive Image Models with Auxiliary Decoders , 2019, ArXiv.

[49]  Sergey Levine,et al.  Stochastic Variational Video Prediction , 2017, ICLR.

[50]  Sjoerd van Steenkiste,et al.  Towards Accurate Generative Models of Video: A New Metric & Challenges , 2018, ArXiv.

[51]  Yitong Li,et al.  Video Generation From Text , 2017, AAAI.

[52]  David A. Shamma,et al.  The New Data and New Challenges in Multimedia Research , 2015, ArXiv.

[53]  Jürgen Schmidhuber,et al.  World Models , 2018, ArXiv.

[54]  Seunghoon Hong,et al.  Decomposing Motion and Content for Natural Video Sequence Prediction , 2017, ICLR.

[55]  Rob Fergus,et al.  Stochastic Video Generation with a Learned Prior , 2018, ICML.

[56]  Luc Van Gool,et al.  Dynamic Filter Networks , 2016, NIPS.

[57]  Andrew Zisserman,et al.  A Short Note about Kinetics-600 , 2018, ArXiv.

[58]  Andrew Owens,et al.  Fighting Fake News: Image Splice Detection via Learned Self-Consistency , 2018, ECCV.

[59]  Sergey Levine,et al.  Stochastic Adversarial Video Prediction , 2018, ArXiv.