Zero-Shot Text-to-Image Generation
Posted by 爱吃猫的鱼0 on September 28, 2021, 18:02
Alec Radford | Ilya Sutskever | Mark Chen | Scott Gray | Aditya Ramesh | Mikhail Pavlov | Gabriel Goh | Chelsea Voss
[1] Geoffrey E. Hinton. Tensor Product Variable Binding and the Representation of Symbolic Structures in Connectionist Systems, 1991.
[2] Simon Haykin, et al. Gradient-Based Learning Applied to Document Recognition, 2001.
[3] Li Fei-Fei, et al. ImageNet: A large-scale hierarchical image database, 2009, CVPR.
[4] Yoshua Bengio, et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation, 2013, ArXiv.
[5] Daan Wierstra, et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models, 2014, ICML.
[6] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.
[7] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.
[8] Alex Graves, et al. DRAW: A Recurrent Neural Network For Image Generation, 2015, ICML.
[9] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[10] Ruslan Salakhutdinov, et al. Generating Images from Captions with Attention, 2015, ICLR.
[11] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[12] David A. Shamma, et al. YFCC100M, 2015, Commun. ACM.
[13] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[14] Wojciech Zaremba, et al. Improved Techniques for Training GANs, 2016, NIPS.
[15] Bernt Schiele, et al. Generative Adversarial Text to Image Synthesis, 2016, ICML.
[16] Jian Sun, et al. Identity Mappings in Deep Residual Networks, 2016, ECCV.
[17] Bernt Schiele, et al. Learning What and Where to Draw, 2016, NIPS.
[18] Samy Bengio, et al. Generating Sentences from a Continuous Space, 2015, CoNLL.
[19] Yoshua Bengio, et al. Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[21] Sepp Hochreiter, et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium, 2017, NIPS.
[22] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.
[23] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.
[24] Xi Chen, et al. PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications, 2017, ICLR.
[25] Chen Sun, et al. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[26] Alexei A. Efros, et al. Image-to-Image Translation with Conditional Adversarial Networks, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Christopher Burgess, et al. beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework, 2016, ICLR 2016.
[28] Xin Wang, et al. Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks, 2017, NIPS.
[29] Dimitris N. Metaxas, et al. StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks, 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[30] Oriol Vinyals, et al. Neural Discrete Representation Learning, 2017, NIPS.
[31] Hao Wu, et al. Mixed Precision Training, 2017, ICLR.
[32] Zhe Gan, et al. AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[33] Radu Soricut, et al. Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning, 2018, ACL.
[34] Dan Klein, et al. Learning with Latent Language, 2017, NAACL.
[35] Ilya Sutskever, et al. Generating Long Sequences with Sparse Transformers, 2019, ArXiv.
[36] Xiaogang Wang, et al. StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[37] Ali Razavi, et al. Generating Diverse High-Fidelity Images with VQ-VAE-2, 2019, NeurIPS.
[38] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[39] Wei Chen, et al. DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-To-Image Synthesis, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Martin Jaggi, et al. PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization, 2019, NeurIPS.
[41] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[42] Lei Zhang, et al. Object-Driven Text-To-Image Synthesis via Adversarial Training, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[43] Swagath Venkataramani, et al. Ultra-Low Precision 4-bit Training of Deep Neural Networks, 2020, NeurIPS.
[44] Ilya Sutskever, et al. Jukebox: A Generative Model for Music, 2020, ArXiv.
[45] Samyam Rajbhandari, et al. ZeRO: Memory Optimization Towards Training A Trillion Parameter Models, 2019, ArXiv.
[46] Jiawei Han, et al. Understanding the Difficulty of Training Transformers, 2020, EMNLP.
[47] N. Sebe, et al. DF-GAN: Deep Fusion Generative Adversarial Networks for Text-to-Image Synthesis, 2020, ArXiv.
[48] Mark Chen, et al. Generative Pretraining From Pixels, 2020, ICML.
[49] Ivan Provilkov, et al. BPE-Dropout: Simple and Effective Subword Regularization, 2019, ACL.
[50] Klaus Greff, et al. On the Binding Problem in Artificial Neural Networks, 2020, ArXiv.
[51] Jiasen Lu, et al. X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers, 2020, EMNLP.
[52] Ilya Sutskever, et al. Learning Transferable Visual Models From Natural Language Supervision, 2021, ICML.
[53] Honglak Lee, et al. Text-to-Image Generation Grounded by Fine-Grained User Attention, 2020, 2021 IEEE Winter Conference on Applications of Computer Vision (WACV).