L-CoIns: Language-based Colorization With Instance Awareness

Existing language-based colorization methods still have difficulty in distinguishing instances corresponding to the same object words. In this paper, we propose a transformer-based framework to automatically aggregate similar image patches and achieve instance awareness without any additional knowledge. By applying our presented luminance augmentation and counter-color loss to break down the statistical correlation between luminance and color words, our model is driven to synthesize colors with better descriptive consistency. We further collect a dataset that provides distinctive visual characteristics and detailed language descriptions for multiple instances in the same image.
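The luminance augmentation described above can be sketched as a random perturbation of the L (luminance) channel of a Lab image while leaving the a/b (color) channels untouched, so that luminance no longer predicts the color named in the caption. This is a minimal illustrative sketch, not the paper's exact procedure; the function name and perturbation ranges are assumptions.

```python
import numpy as np

def augment_luminance(lab_image, rng=None):
    """Randomly rescale and shift the L channel of a Lab image.

    lab_image: float array of shape (H, W, 3); L in [0, 100], a/b in [-128, 127].
    The a/b (color) channels are left unchanged, decorrelating luminance
    from the color words in the paired description.
    The perturbation ranges below are illustrative assumptions.
    """
    rng = rng or np.random.default_rng()
    scale = rng.uniform(0.7, 1.3)      # random contrast on luminance
    shift = rng.uniform(-20.0, 20.0)   # random brightness offset
    out = lab_image.astype(np.float64).copy()
    out[..., 0] = np.clip(out[..., 0] * scale + shift, 0.0, 100.0)
    return out
```

A model trained on such augmented inputs cannot rely on brightness cues (e.g. "dark regions are probably brown") and must instead follow the language description.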
