TextDiffuser: Diffusion Models as Text Painters
Furu Wei | Qifeng Chen | Yupan Huang | Tengchao Lv | Lei Cui | Jingye Chen
[1] T. Jaakkola,et al. Towards Coherent Image Inpainting Using Denoising Diffusion Implicit Models , 2023, ICML.
[2] Xu Tan,et al. HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace , 2023, ArXiv.
[3] Xiangyang Xue,et al. Weakly-Supervised Text Instance Segmentation , 2023, ArXiv.
[4] Yijuan Lu,et al. Diffusion-based Document Layout Generation , 2023, ICDAR.
[5] Ariel Shamir,et al. Word-As-Image for Semantic Typography , 2023, ACM Trans. Graph.
[6] Li Dong,et al. Language Is Not All You Need: Aligning Perception with Language Models , 2023, NeurIPS.
[7] Maneesh Agrawala,et al. Adding Conditional Control to Text-to-Image Diffusion Models , 2023, ArXiv.
[8] W. Freeman,et al. Muse: Text-To-Image Generation via Masked Generative Transformers , 2023, ICML.
[9] Y. Gan,et al. OCR-RTPS: an OCR-based real-time positioning system for the valet parking , 2022, Applied Intelligence.
[10] Issam H. Laradji,et al. OCR-VQGAN: Taming Text-within-Image Generation , 2022, 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV).
[11] Kai Chen,et al. Real-time Scene Text Detection with Differentiable Binarization , 2019, AAAI.
[12] H. Lu,et al. GlyphDraw: Learning to Draw Chinese Characters in Image Synthesis Models Coherently , 2023, ArXiv.
[13] Daniel H Garrette,et al. Character-Aware Models Improve Visual Text Rendering , 2022, ACL.
[14] Juhua Liu,et al. Diff-Font: Diffusion Model for Robust One-Shot Font Generation , 2022, ArXiv.
[15] Qingfeng Tan,et al. Exploring Stroke-Level Modifications for Scene Text Editing , 2022, AAAI.
[16] Fang Wen,et al. Paint by Example: Exemplar-based Image Editing with Diffusion Models , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[17] Bryan Catanzaro,et al. eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers , 2022, ArXiv.
[18] Hua Wu,et al. ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model with Knowledge-Enhanced Mixture-of-Denoising-Experts , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Ludwig Schmidt,et al. LAION-5B: An open large-scale dataset for training next generation image-text models , 2022, NeurIPS.
[20] Ming-Hsuan Yang,et al. Diffusion Models: A Comprehensive Survey of Methods and Applications , 2022, ACM Computing Surveys.
[21] Yuanzhen Li,et al. DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Amit H. Bermano,et al. An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion , 2022, ICLR.
[23] Jonathan Ho. Classifier-Free Diffusion Guidance , 2022, ArXiv.
[24] Rowel Atienza,et al. Scene Text Recognition with Permuted Autoregressive Sequence Models , 2022, ECCV.
[25] Jianqi Ma,et al. BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[26] David J. Fleet,et al. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding , 2022, NeurIPS.
[27] Shenggao Zhu,et al. Look Closer to Supervise Better: One-Shot Font Generation via Component-Based Discriminator , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Miaosen Wang,et al. C3-STISR: Scene Text Image Super-resolution with Triple Clues , 2022, IJCAI.
[29] Prafulla Dhariwal,et al. Hierarchical Text-Conditional Image Generation with CLIP Latents , 2022, ArXiv.
[30] Lei Zhang,et al. A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[31] L. Gool,et al. RePaint: Inpainting using Denoising Diffusion Probabilistic Models , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[32] B. Ommer,et al. High-Resolution Image Synthesis with Latent Diffusion Models , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[33] Prafulla Dhariwal,et al. GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models , 2021, ICML.
[34] Jong-Chul Ye,et al. Come-Closer-Diffuse-Faster: Accelerating Conditional Diffusion Models for Inverse Problems through Stochastic Contraction , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Fang Wen,et al. Vector Quantized Diffusion Model for Text-to-Image Synthesis , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[36] D. Lischinski,et al. Blended Diffusion for Text-driven Editing of Natural Images , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[37] David J. Fleet,et al. Palette: Image-to-Image Diffusion Models , 2021, SIGGRAPH.
[38] David J. Fleet,et al. Cascaded Diffusion Models for High Fidelity Image Generation , 2021, J. Mach. Learn. Res.
[39] B. Rosenhahn,et al. Text to Image Generation with Semantic-Spatial Aware GAN , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Shenggao Zhu,et al. Detecting Tampered Scene Text in the Wild , 2022, ECCV.
[41] Xin Jiang,et al. Wukong: 100 Million Large-scale Chinese Cross-modal Pre-training Dataset and A Foundation Framework , 2022, ArXiv.
[42] Jenia Jitsev,et al. LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs , 2021, ArXiv.
[43] Yupan Huang,et al. Unifying Multimodal Transformer for Bi-directional Image and Text Generation , 2021, ACM Multimedia.
[44] Wataru Shimoda,et al. De-rendering Stylized Texts , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[45] Cha Zhang,et al. TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models , 2021, AAAI.
[46] Sungrae Park,et al. RewriteNet: Reliable Scene Text Editing with Implicit Decomposition of Text Contents and Styles , 2021, arXiv:2107.11041.
[47] Xiangyang Xue,et al. Scene Text Telescope: Text-Focused Scene Image Super-Resolution , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[48] Prafulla Dhariwal,et al. Diffusion Models Beat GANs on Image Synthesis , 2021, NeurIPS.
[49] C. Miao,et al. Diverse Image Inpainting with Bidirectional and Autoregressive Transformers , 2021, ACM Multimedia.
[50] Ronan Le Bras,et al. CLIPScore: A Reference-free Evaluation Metric for Image Captioning , 2021, EMNLP.
[51] Hyunjung Shim,et al. Multiple Heads are Better than One: Few-shot Font Generation with Multiple Localized Experts , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[52] Jing Liao,et al. High-Fidelity Pluralistic Image Completion with Transformers , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).
[53] Dong Liu,et al. Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[54] Yongdong Zhang,et al. Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[55] Yong Xu,et al. Mask-guided GAN for robust text editing in the scene , 2021, Neurocomputing.
[56] Ilya Sutskever,et al. Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.
[57] Alec Radford,et al. Zero-Shot Text-to-Image Generation , 2021, ICML.
[58] Brian L. Price,et al. Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach , 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[59] Jiaming Song,et al. Denoising Diffusion Implicit Models , 2020, ICLR.
[60] Yanning Zhang,et al. A Robust Attentional Framework for License Plate Recognition in the Wild , 2020, IEEE Transactions on Intelligent Transportation Systems.
[61] Lianwen Jin,et al. EraseNet: End-to-End Text Removal in the Wild , 2020, IEEE Transactions on Image Processing.
[62] Wei Huang,et al. Rethinking Image Inpainting via a Mutual Encoder-Decoder with Feature Equalizations , 2020, ECCV.
[63] Alessandro Achille,et al. Layout Generation and Completion with Self-attention , 2020, ArXiv.
[64] Pieter Abbeel,et al. Denoising Diffusion Probabilistic Models , 2020, NeurIPS.
[65] Lei Zhao,et al. UCTGAN: Diverse Image Inpainting Based on Unsupervised Cross-Space Translation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[66] Xiang Bai,et al. Scene Text Image Super-Resolution in the Wild , 2020, ECCV.
[67] Errui Ding,et al. Towards Accurate Scene Text Recognition With Semantic Reasoning Networks , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[68] Colin Raffel,et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer , 2019, J. Mach. Learn. Res.
[69] Omri Ben-Eliezer,et al. READ: Recursive Autoencoders for Document Layout Generation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[70] Yingli Tian,et al. Unambiguous Scene Text Segmentation With Referring Expression Comprehension , 2020, IEEE Transactions on Image Processing.
[71] Thomas H. Li,et al. StructureFlow: Image Inpainting via Structure-Aware Appearance Flow , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[72] Liang Wu,et al. Editing Text in the Wild , 2019, ACM Multimedia.
[73] Jiawei He,et al. LayoutVAE: Stochastic Scene Layout Generation From a Label Set , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[74] Baining Guo,et al. Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[75] Jing Zhang,et al. MirrorGAN: Learning Text-To-Image Generation by Redescription , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[76] Jianfei Cai,et al. Pluralistic Image Completion , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[77] Tingfa Xu,et al. LayoutGAN: Generating Graphic Layouts with Wireframe Discriminators , 2019, ICLR.
[78] Xiang Li,et al. Shape Robust Text Detection With Progressive Scale Expansion Network , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[79] Frank Hutter,et al. Decoupled Weight Decay Regularization , 2017, ICLR.
[80] Xiangyang Xue,et al. Arbitrary-Oriented Scene Text Detection via Rotation Proposals , 2017, IEEE Transactions on Multimedia.
[81] Sepp Hochreiter,et al. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium , 2017, NIPS.
[82] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[83] Shuchang Zhou,et al. EAST: An Efficient and Accurate Scene Text Detector , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[84] Safaa S. Omran,et al. Iraqi car license plate recognition using OCR , 2017, 2017 Annual Conference on New Trends in Information & Communications Technology Applications (NTICT).
[85] Dimitris N. Metaxas,et al. StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).
[86] Xiang Bai,et al. An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[87] Bernt Schiele,et al. Generative Adversarial Text to Image Synthesis , 2016, ICML.
[88] Alexei A. Efros,et al. Context Encoders: Feature Learning by Inpainting , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[89] A. Vedaldi,et al. Synthetic Data for Text Localisation in Natural Images , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[90] Tianqi Chen,et al. Training Deep Nets with Sublinear Memory Cost , 2016, ArXiv.
[91] Thomas Brox,et al. U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.
[92] Surya Ganguli,et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics , 2015, ICML.
[93] Markus Schreiber,et al. Detecting symbols on road surface for mapping and localization using OCR , 2014, 17th International IEEE Conference on Intelligent Transportation Systems (ITSC).
[94] Andrew Zisserman,et al. Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition , 2014, ArXiv.
[95] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[96] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.
[97] Guillermo Sapiro,et al. Simultaneous structure and texture image inpainting , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings.
[98] Guillermo Sapiro,et al. Filling-in by joint interpolation of vector fields and gray levels , 2001, IEEE Trans. Image Process.
[99] Guillermo Sapiro,et al. Image inpainting , 2000, SIGGRAPH.
[100] Mehdi Hatamian,et al. Optical character recognition by the method of moments , 1987.
[101] J. M. White,et al. Image Thresholding for Optical Character Recognition and Other Applications Requiring Character Image Extraction , 1983, IBM J. Res. Dev.