Text-to-Image Generation Grounded by Fine-Grained User Attention
Honglak Lee | Jason Baldridge | Yinfei Yang | Jing Yu Koh
[1] Xiaogang Wang,et al. Learning to Predict Layout-to-image Conditional Convolutions for Semantic Image Synthesis , 2019, NeurIPS.
[2] Wojciech Zaremba,et al. Improved Techniques for Training GANs , 2016, NIPS.
[3] David J. Fleet,et al. VSE++: Improving Visual-Semantic Embeddings with Hard Negatives , 2017, BMVC.
[4] Jordi Pont-Tuset,et al. Connecting Vision and Language with Localized Narratives , 2019, ECCV.
[5] Mauro Di Manzo,et al. Natural Language Input For Scene Generation , 1983, EACL.
[6] Seunghoon Hong,et al. Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[7] Bernt Schiele,et al. Generative Adversarial Text to Image Synthesis , 2016, ICML.
[8] Alexei A. Efros,et al. Image-to-Image Translation with Conditional Adversarial Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[9] Zhe Gan,et al. AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[10] Jan Kautz,et al. High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[11] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[12] Li Fei-Fei,et al. Image Generation from Scene Graphs , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[13] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.
[14] Andrew Zisserman,et al. Automated Flower Classification over a Large Number of Classes , 2008, 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing.
[15] Christopher D. Manning,et al. Introduction to Information Retrieval , 2010, J. Assoc. Inf. Sci. Technol.
[16] Dimitris N. Metaxas,et al. StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[17] Ray Kurzweil,et al. Learning Cross-Lingual Sentence Representations via a Multi-task Dual-Encoder Model , 2019, RepL4NLP@ACL.
[18] Lei Zhang,et al. Object-Driven Text-To-Image Synthesis via Adversarial Training , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[20] Yoshua Bengio,et al. Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[21] Gabriel Ilharco,et al. Large-Scale Representation Learning from Visually Grounded Untranscribed Speech , 2019, CoNLL.
[22] Taesung Park,et al. Semantic Image Synthesis With Spatially-Adaptive Normalization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[23] Xiaogang Wang,et al. PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph , 2019, NeurIPS.
[24] Daniel Gillick,et al. End-to-End Retrieval in Continuous Space , 2018, ArXiv.
[25] Jason Baldridge,et al. Crisscrossed Captions: Extended Intramodal and Intermodal Semantic Similarity Judgments for MS-COCO , 2020, EACL.
[26] Vincent J. Della Pietra,et al. The mathematics of statistical machine translation , 1993.
[27] Douglas Eck,et al. A Neural Representation of Sketch Drawings , 2017, ICLR.
[28] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[29] Sepp Hochreiter,et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.
[30] Vicente Ordonez,et al. Text2Scene: Generating Compositional Scenes From Textual Descriptions , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[32] Yuan Yu,et al. TensorFlow: A system for large-scale machine learning , 2016, OSDI.
[33] Yoshua Bengio,et al. ChatPainter: Improving Text to Image Generation using Dialogue , 2018, ICLR.
[34] Radu Soricut,et al. Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning , 2018, ACL.
[35] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Ray Kurzweil,et al. Improving Multilingual Sentence Embedding using Bi-directional Dual Encoder with Additive Margin Softmax , 2019, IJCAI.
[37] Rishi Sharma,et al. A Note on the Inception Score , 2018, ArXiv.
[38] Vittorio Ferrari,et al. COCO-Stuff: Thing and Stuff Classes in Context , 2016, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[39] Richard Sproat,et al. WordsEye: an automatic text-to-scene conversion system , 2001, SIGGRAPH.
[40] Jason Baldridge,et al. Type-Supervised Hidden Markov Models for Part-of-Speech Tagging with Incomplete Tag Dictionaries , 2012, EMNLP.