T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation
Xihui Liu | Zhenguo Li | Enze Xie | Kaiyue Sun | Kaiyi Huang
[1] C. Ding,et al. X-IQE: eXplainable Image Quality Evaluation for Text-to-Image Generation with Visual Large Language Models , 2023, ArXiv.
[2] William Yang Wang,et al. LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation , 2023, NeurIPS.
[3] Mohamed Elhoseiny,et al. MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models , 2023, ArXiv.
[4] T. Zhang,et al. RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment , 2023, ArXiv.
[5] Mohamed Elhoseiny,et al. HRS-Bench: Holistic, Reliable and Scalable Benchmark for Text-to-Image Models , 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[6] Yang Zhang,et al. Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis , 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV).
[7] A. Vedaldi,et al. Training-Free Layout Control with Cross-Attention Guidance , 2023, ArXiv.
[8] S. Savarese,et al. HIVE: Harnessing Human Feedback for Instructional Visual Editing , 2023, ArXiv.
[9] P. Abbeel,et al. Aligning Text-to-Image Models using Human Feedback , 2023, ArXiv.
[10] Lior Wolf,et al. Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models , 2023, ACM Trans. Graph.
[11] S. Savarese,et al. BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models , 2023, ICML.
[12] W. Freeman,et al. Muse: Text-To-Image Generation via Masked Generative Transformers , 2023, ICML.
[13] William Yang Wang,et al. Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis , 2022, ICLR.
[14] Vitali Petsiuk. Human Evaluation of Text-to-Image Models on a Multi-Task Benchmark , 2022, ArXiv.
[15] J. Tenenbaum,et al. Compositional Visual Generation with Composable Diffusion Models , 2022, ECCV.
[16] David J. Fleet,et al. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding , 2022, NeurIPS.
[17] Prafulla Dhariwal,et al. Hierarchical Text-Conditional Image Generation with CLIP Latents , 2022, ArXiv.
[18] Martin Renqiang Min,et al. StyleT2I: Toward Compositional and High-Fidelity Text-to-Image Synthesis , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Yaniv Taigman,et al. Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors , 2022, ECCV.
[20] S. Hoi,et al. BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation , 2022, ICML.
[21] B. Ommer,et al. High-Resolution Image Synthesis with Latent Diffusion Models , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Prafulla Dhariwal,et al. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models , 2021, ICML.
[23] Yelong Shen,et al. LoRA: Low-Rank Adaptation of Large Language Models , 2021, ICLR.
[24] David J. Fleet,et al. Cascaded Diffusion Models for High Fidelity Image Generation , 2021, J. Mach. Learn. Res.
[25] Prafulla Dhariwal,et al. Diffusion Models Beat GANs on Image Synthesis , 2021, NeurIPS.
[26] Ronan Le Bras,et al. CLIPScore: A Reference-free Evaluation Metric for Image Captioning , 2021, EMNLP.
[27] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[28] Philipp Krähenbühl,et al. Simple Multi-dataset Detection , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[29] Alec Radford,et al. Zero-Shot Text-to-Image Generation , 2021, ICML.
[30] Prafulla Dhariwal,et al. Improved Denoising Diffusion Probabilistic Models , 2021, ICML.
[31] Jing Yu Koh,et al. Cross-Modal Contrastive Learning for Text-to-Image Generation , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[32] Jian Sun,et al. Objects365: A Large-Scale, High-Quality Dataset for Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[33] Wei Chen,et al. DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-To-Image Synthesis , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Zhe Gan,et al. AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[35] Frank Hutter,et al. Decoupled Weight Decay Regularization , 2017, ICLR.
[36] Peter Kontschieder,et al. The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[37] Sepp Hochreiter,et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.
[38] Dimitris N. Metaxas,et al. StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[39] Bernt Schiele,et al. Learning What and Where to Draw , 2016, NIPS.
[40] Wojciech Zaremba,et al. Improved Techniques for Training GANs , 2016, NIPS.
[41] Bernt Schiele,et al. Generative Adversarial Text to Image Synthesis , 2016, ICML.
[42] Aaron C. Courville,et al. Generative Adversarial Networks , 2014, 1406.2661.
[43] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[44] Pietro Perona,et al. The Caltech-UCSD Birds-200-2011 Dataset , 2011.
[45] Andrew Zisserman,et al. Automated Flower Classification over a Large Number of Classes , 2008, 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing.
[46] Mohit Bansal,et al. DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers , 2022, ArXiv.
[47] Trevor Darrell,et al. Benchmark for Compositional Text-to-Image Synthesis , 2021, NeurIPS Datasets and Benchmarks.
[48] Sixth Indian Conference on Computer Vision, Graphics & Image Processing, ICVGIP 2008, Bhubaneswar, India, 16-19 December 2008 , 2008, ICVGIP.