Predicting Video with VQVAE

Aäron van den Oord | Ali Razavi | Jacob Walker

[1] Oriol Vinyals, et al. Neural Discrete Representation Learning, 2017, NIPS.

[2] Karen Simonyan, et al. The challenge of realistic music generation: modelling raw audio at scale, 2018, NeurIPS.

[3] Sergey Levine, et al. Unsupervised Learning for Physical Interaction through Video Prediction, 2016, NIPS.

[4] Pieter Abbeel, et al. PixelSNAIL: An Improved Autoregressive Generative Model, 2017, ICML.

[5] Diego de Las Casas, et al. Transformation-based Adversarial Video Prediction on Large-Scale Data, 2020, ArXiv.

[6] Serge J. Belongie, et al. Controllable Video Generation with Sparse Trajectories, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[7] Yale Song, et al. Video Prediction with Appearance and Motion Conditions, 2018, ICML.

[8] Jing Dong, et al. On the generalization of GAN image forensics, 2019, CCBR.

[9] Nal Kalchbrenner, et al. Generating High Fidelity Images with Subscale Pixel Networks and Multidimensional Upscaling, 2018, ICLR.

[10] Wei Xiong, et al. Learning to Generate Time-Lapse Videos Using Multi-stage Dynamic Generative Adversarial Networks, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[11] Christopher K. I. Williams, et al. The shape variational autoencoder: A deep generative model of part-segmented 3D objects, 2017, Comput. Graph. Forum.

[12] Matthias Bethge, et al. A note on the evaluation of generative models, 2015, ICLR.

[13] Yann LeCun, et al. Predicting Deeper into the Future of Semantic Segmentation, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[14] Trevor Darrell, et al. Disentangling Propagation and Generation for Video Prediction, 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[15] Jeff Donahue, et al. Large Scale GAN Training for High Fidelity Natural Image Synthesis, 2018, ICLR.

[16] Shunta Saito, et al. TGANv2: Efficient Training of Large Models for Video Generation with Multiple Subsampling Layers, 2018, ArXiv.

[17] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.

[18] Martial Hebert, et al. An Uncertain Future: Forecasting from Static Images Using Variational Autoencoders, 2016, ECCV.

[19] Ruben Villegas, et al. Learning to Generate Long-term Future via Hierarchical Prediction, 2017, ICML.

[20] Jakob Uszkoreit, et al. Scaling Autoregressive Video Models, 2019, ICLR.

[21] B. Caputo, et al. Recognizing human actions: a local SVM approach, 2004, Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004.

[22] Viorica Patraucean, et al. Spatio-temporal video autoencoder with differentiable memory, 2015, ArXiv.

[23] Sergey Levine, et al. VideoFlow: A Flow-Based Generative Model for Video, 2019, ArXiv.

[24] Antonio Torralba, et al. Generating Videos with Scene Dynamics, 2016, NIPS.

[25] Razvan Pascanu, et al. Imagination-Augmented Agents for Deep Reinforcement Learning, 2017, NIPS.

[26] Nitish Srivastava, et al. Unsupervised Learning of Video Representations using LSTMs, 2015, ICML.

[27] Jiajun Wu, et al. Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks, 2016, NIPS.

[28] Junichi Yamagishi, et al. MesoNet: a Compact Facial Video Forgery Detection Network, 2018, 2018 IEEE International Workshop on Information Forensics and Security (WIFS).

[29] C. Lee Giles, et al. Learning a Hierarchical Latent-Variable Model of 3D Shapes, 2017, 2018 International Conference on 3D Vision (3DV).

[30] Ming-Hsuan Yang, et al. Flow-Grounded Spatial-Temporal Video Prediction from Still Images, 2018, ECCV.

[31] Ali Razavi, et al. Generating Diverse High-Fidelity Images with VQ-VAE-2, 2019, NeurIPS.

[32] Sergio Gomez Colmenarejo, et al. Parallel Multiscale Autoregressive Density Estimation, 2017, ICML.

[33] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.

[34] Jaakko Lehtinen, et al. Progressive Growing of GANs for Improved Quality, Stability, and Variation, 2017, ICLR.

[35] Sergio Escalera, et al. Folded Recurrent Neural Networks for Future Video Prediction, 2017, ECCV.

[36] Martial Hebert, et al. The Pose Knows: Video Forecasting by Generating Pose Futures, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[37] Ilya Sutskever, et al. Jukebox: A Generative Model for Music, 2020, ArXiv.

[38] Premkumar Natarajan, et al. Recurrent Convolutional Strategies for Face Manipulation Detection in Videos, 2019, CVPR Workshops.

[39] Sergey Levine, et al. Self-Supervised Visual Planning with Temporal Skip Connections, 2017, CoRL.

[40] Heiga Zen, et al. WaveNet: A Generative Model for Raw Audio, 2016, SSW.

[41] Jeff Donahue, et al. Adversarial Video Generation on Complex Datasets, 2019.

[42] Marc'Aurelio Ranzato, et al. Video (language) modeling: a baseline for generative models of natural videos, 2014, ArXiv.

[43] Yoshua Bengio, et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation, 2013, ArXiv.

[44] Yann LeCun, et al. Predicting Future Instance Segmentations by Forecasting Convolutional Features, 2018, ECCV.

[45] Jan Kautz, et al. MoCoGAN: Decomposing Motion and Content for Video Generation, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[46] Yann LeCun, et al. Deep multi-scale video prediction beyond mean square error, 2015, ICLR.

[47] Koray Kavukcuoglu, et al. Pixel Recurrent Neural Networks, 2016, ICML.

[48] Karen Simonyan, et al. Hierarchical Autoregressive Image Models with Auxiliary Decoders, 2019, ArXiv.

[49] Sergey Levine, et al. Stochastic Variational Video Prediction, 2017, ICLR.

[50] Sjoerd van Steenkiste, et al. Towards Accurate Generative Models of Video: A New Metric & Challenges, 2018, ArXiv.

[51] Yitong Li, et al. Video Generation From Text, 2017, AAAI.

[52] David A. Shamma, et al. The New Data and New Challenges in Multimedia Research, 2015, ArXiv.

[53] Jürgen Schmidhuber, et al. World Models, 2018, ArXiv.

[54] Seunghoon Hong, et al. Decomposing Motion and Content for Natural Video Sequence Prediction, 2017, ICLR.

[55] Rob Fergus, et al. Stochastic Video Generation with a Learned Prior, 2018, ICML.

[56] Luc Van Gool, et al. Dynamic Filter Networks, 2016, NIPS.

[57] Andrew Zisserman, et al. A Short Note about Kinetics-600, 2018, ArXiv.

[58] Andrew Owens, et al. Fighting Fake News: Image Splice Detection via Learned Self-Consistency, 2018, ECCV.

[59] Sergey Levine, et al. Stochastic Adversarial Video Prediction, 2018, ArXiv.