Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models
Zekang Chen | H. Lu | Chen Chen | Haonan Lu | Jian Ma | Jiancang Ma | Ruichen Wang | Xiaodong Lin
[1] Yang Zhang, et al. Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis, 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[2] A. Vedaldi, et al. Training-Free Layout Control with Cross-Attention Guidance, 2023, ArXiv.
[3] Dimitris N. Metaxas, et al. SVDiff: Compact Parameter Space for Diffusion Fine-Tuning, 2023, ArXiv.
[4] Jun-Juan Zhu, et al. Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection, 2023, ECCV.
[5] Jingren Zhou, et al. Cones: Concept Neurons in Diffusion Models for Customized Generation, 2023, ICML.
[6] J. P. Lewis, et al. Directed Diffusion: Direct Control of Object Placement through Attention Guidance, 2023, ArXiv.
[7] Maneesh Agrawala, et al. Adding Conditional Control to Text-to-Image Diffusion Models, 2023, ArXiv.
[8] Á. Jiménez. Mixture of Diffusers for scene composition and high resolution image generation, 2023, ArXiv.
[9] Lior Wolf, et al. Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models, 2023, ArXiv.
[10] Yong Jae Lee, et al. GLIGEN: Open-Set Grounded Text-to-Image Generation, 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Radu Tudor Ionescu, et al. Diffusion Models in Vision: A Survey, 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[12] William Yang Wang, et al. Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis, 2022, ICLR.
[13] Dong Huk Park, et al. Shape-Guided Diffusion with Inside-Outside Attention, 2022, ArXiv.
[14] Bryan Catanzaro, et al. eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers, 2022, ArXiv.
[15] Ming-Hsuan Yang, et al. Diffusion Models: A Comprehensive Survey of Methods and Applications, 2022, ACM Computing Surveys.
[16] J. Tenenbaum, et al. Compositional Visual Generation with Composable Diffusion Models, 2022, ECCV.
[17] David J. Fleet, et al. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding, 2022, NeurIPS.
[18] Prafulla Dhariwal, et al. Hierarchical Text-Conditional Image Generation with CLIP Latents, 2022, ArXiv.
[19] B. Ommer, et al. High-Resolution Image Synthesis with Latent Diffusion Models, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Fu-En Yang, et al. LayoutTransformer: Scene Layout Generation with Conceptual and Spatial Diversity, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Ilya Sutskever, et al. Learning Transferable Visual Models From Natural Language Supervision, 2021, ICML.
[22] Alec Radford, et al. Zero-Shot Text-to-Image Generation, 2021, ICML.
[23] B. Ommer, et al. Taming Transformers for High-Resolution Image Synthesis, 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Nicolas Usunier, et al. End-to-End Object Detection with Transformers, 2020, ECCV.
[25] Silvio Savarese, et al. Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Sepp Hochreiter, et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium, 2017, NIPS.
[27] Carl Doersch, et al. Tutorial on Variational Autoencoders, 2016, ArXiv.
[28] Andrew Y. Ng, et al. End-to-End People Detection in Crowded Scenes, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Thomas Brox, et al. U-Net: Convolutional Networks for Biomedical Image Segmentation, 2015, MICCAI.
[30] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.