[1] Eero P. Simoncelli,et al. Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.
[2] Ronen Basri,et al. Actions as space-time shapes , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.
[3] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[4] Mubarak Shah,et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.
[5] Yoshua Bengio,et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation , 2013, ArXiv.
[6] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[7] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[8] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.
[9] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[10] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[11] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[12] Andrew Zisserman,et al. Spatial Transformer Networks , 2015, NIPS.
[13] Yann LeCun,et al. Deep multi-scale video prediction beyond mean square error , 2015, ICLR.
[14] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Antonio Torralba,et al. Anticipating Visual Representations from Unlabeled Video , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Sebastian Ramos,et al. The Cityscapes Dataset for Semantic Urban Scene Understanding , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Antonio Torralba,et al. Generating Videos with Scene Dynamics , 2016, NIPS.
[18] Sergey Levine,et al. Unsupervised Learning for Physical Interaction through Video Prediction , 2016, NIPS.
[19] Shunta Saito,et al. Temporal Generative Adversarial Nets with Singular Value Clipping , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[20] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[21] Aren Jansen,et al. Audio Set: An ontology and human-labeled dataset for audio events , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Antonio Torralba,et al. Generating the Future with Adversarial Transformers , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Joon Son Chung,et al. You said that? , 2017, BMVC.
[24] Andrew Zisserman,et al. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[25] Seunghoon Hong,et al. Decomposing Motion and Content for Natural Video Sequence Prediction , 2017, ICLR.
[26] Peter V. Gehler,et al. Semantic Video CNNs Through Representation Warping , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[27] Sergey Levine,et al. Self-Supervised Visual Planning with Temporal Skip Connections , 2017, CoRL.
[28] Oriol Vinyals,et al. Neural Discrete Representation Learning , 2017, NIPS.
[29] Xiaoou Tang,et al. LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[30] Sergey Levine,et al. Stochastic Variational Video Prediction , 2017, ICLR.
[31] Andrew Zisserman,et al. A Short Note about Kinetics-600 , 2018, ArXiv.
[32] Douglas Eck,et al. Music Transformer , 2018, ArXiv:1809.04281.
[33] Serge J. Belongie,et al. Controllable Video Generation with Sparse Trajectories , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[34] Sergey Levine,et al. Stochastic Adversarial Video Prediction , 2018, ArXiv.
[35] Sjoerd van Steenkiste,et al. Towards Accurate Generative Models of Video: A New Metric & Challenges , 2018, ArXiv.
[36] Luc Van Gool,et al. Towards High Resolution Video Generation with Progressive Growing of Sliced Wasserstein GANs , 2018, ArXiv.
[37] Jan Kautz,et al. Video-to-Video Synthesis , 2018, NeurIPS.
[38] Rob Fergus,et al. Stochastic Video Generation with a Learned Prior , 2018, ICML.
[39] Jan Kautz,et al. MoCoGAN: Decomposing Motion and Content for Video Generation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[40] Maja Pantic,et al. End-to-End Speech-Driven Facial Animation with Temporal GANs , 2018, BMVC.
[41] Aaron C. Courville,et al. Improved Conditional VRNNs for Video Prediction , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[42] Joon Son Chung,et al. You Said That?: Synthesising Talking Faces from Audio , 2019, International Journal of Computer Vision.
[43] Jeff Donahue,et al. Adversarial Video Generation on Complex Datasets , 2019.
[44] Xiaogang Wang,et al. Video Generation From Single Semantic Label Map , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[45] Taesung Park,et al. Semantic Image Synthesis With Spatially-Adaptive Normalization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Alexandre Lacoste,et al. Quantifying the Carbon Emissions of Machine Learning , 2019, ArXiv.
[47] Min Sun,et al. Point-to-Point Video Generation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[48] Seunghoon Hong,et al. Diversity-Sensitive Conditional Generative Adversarial Networks , 2019, ICLR.
[49] Trevor Darrell,et al. Disentangling Propagation and Generation for Video Prediction , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[50] Andrew McCallum,et al. Energy and Policy Considerations for Deep Learning in NLP , 2019, ACL.
[51] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[52] Alexei A. Efros,et al. Everybody Dance Now , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[53] Stefan Winkler,et al. The Unusual Effectiveness of Averaging in GAN Training , 2018, ICLR.
[54] Anoop Cherian,et al. Sound2Sight: Generating Visual Dynamics from Sound and Context , 2020, ECCV.
[55] S. Levine,et al. VideoFlow: A Conditional Flow-Based Model for Stochastic Video Generation , 2019, ICLR.
[56] Lukasz Kaiser,et al. Reformer: The Efficient Transformer , 2020, ICLR.
[57] Karan Sapra,et al. Hierarchical Multi-Scale Attention for Semantic Segmentation , 2020, ArXiv.
[58] Shunta Saito,et al. Train Sparsely, Generate Densely: Memory-Efficient Unsupervised Training of High-Resolution Temporal GAN , 2020, International Journal of Computer Vision.
[59] Tero Karras,et al. Analyzing and Improving the Image Quality of StyleGAN , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[60] Saeid Nahavandi,et al. Deep learning for deepfakes creation and detection: A survey , 2019, Comput. Vis. Image Underst.
[61] A. Dantcheva,et al. G3AN: Disentangling Appearance and Motion for Video Generation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[62] Arun Mallya,et al. World-Consistent Video-to-Video Synthesis , 2020, ECCV.
[63] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[64] Nikolaos Pappas,et al. Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention , 2020, ICML.
[65] Diego de Las Casas,et al. Transformation-based Adversarial Video Prediction on Large-Scale Data , 2020, ArXiv.
[66] Payal Dhar,et al. The carbon impact of artificial intelligence , 2020, Nature Machine Intelligence.
[67] Jaesik Park,et al. Future Video Synthesis With Object Motion Prediction , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[68] Jonathan T. Barron,et al. What Matters in Unsupervised Optical Flow , 2020, ECCV.
[69] P. Gallinari,et al. Stochastic Latent Residual Video Prediction , 2020, ICML.
[70] Subramanian Ramamoorthy,et al. Lower Dimensional Kernels for Video Discriminators , 2019, Neural Networks.
[71] Mark Chen,et al. Generative Pretraining From Pixels , 2020, ICML.
[72] Yejin Choi,et al. The Curious Case of Neural Text Degeneration , 2019, ICLR.
[73] Wangmeng Zuo,et al. Learning Flow-based Feature Warping for Face Frontalization with Illumination Inconsistent Supervision , 2020, ECCV.
[74] Jakob Uszkoreit,et al. Scaling Autoregressive Video Models , 2019, ICLR.
[75] S. Gelly,et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2020, ICLR.
[76] Alec Radford,et al. Zero-Shot Text-to-Image Generation , 2021, ICML.
[77] Li Fei-Fei,et al. Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[78] Dimitris N. Metaxas,et al. A Good Image Generator Is What You Need for High-Resolution Video Synthesis , 2021, ICLR.
[79] Krzysztof Choromanski,et al. Rethinking Attention with Performers , 2020, ICLR.
[80] B. Ommer,et al. Taming Transformers for High-Resolution Image Synthesis , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[81] Evgeny Burnaev,et al. Latent Video Transformer , 2020, VISIGRAPP.
[82] Sergey Tulyakov,et al. Playable Video Generation , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).