UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild
Juan Carlos Niebles | S. Savarese | S. Ermon | Yihao Feng | Ran Xu | Caiming Xiong | Haiquan Wang | Ning Yu | Can Qin | Yun Fu | Yingbo Zhou | Xinyi Yang | Shu Zhang