Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation
Xintao Wang | Yong Zhang | Yuan Gong | Qifeng Chen | Menghan Xia | Yin-Yin He | Xiaodong Cun | Ying Shan | Haoxin Chen | Jinbo Xing | Chao-Liang Weng
[1] Chen Change Loy, et al. Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation, 2023, ArXiv.
[2] Jingren Zhou, et al. VideoComposer: Compositional Video Synthesis with Motion Controllability, 2023, NeurIPS.
[3] Xintao Wang, et al. Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance, 2023, IEEE Transactions on Visualization and Computer Graphics.
[4] Fu Lee Wang, et al. Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising, 2023, ArXiv.
[5] D. Cohen-Or, et al. A Neural Space-Time Representation for Text-to-Image Personalization, 2023, ACM Trans. Graph.
[6] W. Zuo, et al. ControlVideo: Training-free Controllable Text-to-Video Generation, 2023, ArXiv.
[7] Seung Wook Kim, et al. Align Your Latents: High-Resolution Video Synthesis with Latent Diffusion Models, 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Xintao Wang, et al. Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos, 2023, AAAI.
[9] Humphrey Shi, et al. Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators, 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[10] Yan Huang, et al. VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation, 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Henrique Pondé de Oliveira Pinto, et al. GPT-4 Technical Report, 2023, arXiv:2303.08774.
[12] Lei Zhang, et al. ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation, 2023, ArXiv.
[13] Jingren Zhou, et al. Composer: Creative and Controllable Image Synthesis with Composable Conditions, 2023, ICML.
[14] Jinwoo Shin, et al. Video Probabilistic Diffusion Models in Projected Latent Space, 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Kihyuk Sohn, et al. MAGVIT: Masked Generative Video Transformer, 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[16] Nupur Kumari, et al. Multi-Concept Customization of Text-to-Image Diffusion, 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Vishal M. Patel, et al. VIDM: Video Implicit Diffusion Models, 2022, AAAI.
[18] D. Cohen-Or, et al. Sketch-Guided Text-to-Image Diffusion Models, 2022, SIGGRAPH.
[19] Jiashi Feng, et al. MagicVideo: Efficient Video Generation With Latent Diffusion Models, 2022, ArXiv.
[20] D. Erhan, et al. Phenaki: Variable Length Video Generation From Open Domain Textual Description, 2022, ICLR.
[21] David J. Fleet, et al. Imagen Video: High Definition Video Generation with Diffusion Models, 2022, ArXiv.
[22] Yaniv Taigman, et al. Make-A-Video: Text-to-Video Generation without Text-Video Data, 2022, ICLR.
[23] Yuanzhen Li, et al. DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation, 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Amit H. Bermano, et al. An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion, 2022, ICLR.
[25] Wendi Zheng, et al. CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers, 2022, ICLR.
[26] David J. Fleet, et al. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding, 2022, NeurIPS.
[27] Prafulla Dhariwal, et al. Hierarchical Text-Conditional Image Generation with CLIP Latents, 2022, ArXiv.
[28] David J. Fleet, et al. Video Diffusion Models, 2022, NeurIPS.
[29] Devi Parikh, et al. Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer, 2022, ECCV.
[30] Mohamed Elhoseiny, et al. StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[31] B. Ommer, et al. High-Resolution Image Synthesis with Latent Diffusion Models, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Yelong Shen, et al. LoRA: Low-Rank Adaptation of Large Language Models, 2021, ICLR.
[33] Guillermo Sapiro, et al. GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions, 2021, ArXiv.
[34] Pieter Abbeel, et al. VideoGPT: Video Generation using VQ-VAE and Transformers, 2021, ArXiv.
[35] Andrew Zisserman, et al. Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[36] Ilya Sutskever, et al. Learning Transferable Visual Models From Natural Language Supervision, 2021, ICML.
[37] Pieter Abbeel, et al. Denoising Diffusion Probabilistic Models, 2020, NeurIPS.
[38] Konrad Schindler, et al. Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer, 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[39] Jan Kautz, et al. MoCoGAN: Decomposing Motion and Content for Video Generation, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[40] Shunta Saito, et al. Temporal Generative Adversarial Nets with Singular Value Clipping, 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[41] Antonio Torralba, et al. Generating Videos with Scene Dynamics, 2016, NIPS.
[42] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[43] Mubarak Shah, et al. UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild, 2012, ArXiv.
[44] Qifeng Chen, et al. Latent Video Diffusion Models for High-Fidelity Video Generation with Arbitrary Lengths, 2022, ArXiv.