ProRes: Exploring Degradation-aware Visual Prompt for Universal Image Restoration

Image restoration aims to reconstruct degraded images, e.g., denoising or deblurring. Existing works focus on designing task-specific methods and there are inadequate attempts at universal methods. However, simply unifying multiple tasks into one universal architecture suffers from uncontrollable and undesired predictions. To address those issues, we explore prompt learning in universal architectures for image restoration tasks. In this paper, we present Degradation-aware Visual Prompts, which encode various types of image degradation, e.g., noise and blur, into unified visual prompts. These degradation-aware prompts provide control over image processing and allow weighted combinations for customized image restoration. We then leverage degradation-aware visual prompts to establish a controllable and universal model for image restoration, called ProRes, which is applicable to an extensive range of image restoration tasks. ProRes leverages the vanilla Vision Transformer (ViT) without any task-specific designs. Furthermore, the pre-trained ProRes can easily adapt to new tasks through efficient prompt tuning with only a few images. Without bells and whistles, ProRes achieves competitive performance compared to task-specific methods and experiments can demonstrate its ability for controllable restoration and adaptation for new tasks. The code and models will be released in \url{https://github.com/leonmakise/ProRes}.

[1]  Andrew Gilbert,et al.  UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer , 2023, ArXiv.

[2]  Chunhua Shen,et al.  Images Speak in Images: A Generalist Painter for In-Context Visual Learning , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Humphrey Shi,et al.  OneFormer: One Transformer to Rule Universal Image Segmentation , 2022, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Han Hu,et al.  Could Giant Pretrained Image Models Extract Universal Representations? , 2022, NeurIPS.

[5]  Alexei A. Efros,et al.  Visual Prompting via Image Inpainting , 2022, NeurIPS.

[6]  Xiaogang Wang,et al.  Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs , 2022, NeurIPS.

[7]  Peng Hu,et al.  All-In-One Image Restoration for Unknown Corruption , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Wei-Ting Chen,et al.  Learning Multiple Adverse Weather Removal via Two-stage Knowledge Learning and Multi-contrastive Regularization: Toward a Unified Model , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Jian Sun,et al.  Simple Baselines for Image Restoration , 2022, ECCV.

[10]  P. Milanfar,et al.  MAXIM: Multi-Axis MLP for Image Processing , 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  A. Schwing,et al.  Masked-attention Mask Transformer for Universal Image Segmentation , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Xizhou Zhu,et al.  Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Syed Waqas Zamir,et al.  Restormer: Efficient Transformer for High-Resolution Image Restoration , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Ross B. Girshick,et al.  Masked Autoencoders Are Scalable Vision Learners , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Yihong Gong,et al.  Towards A Universal Model for Cross-Dataset Crowd Counting , 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[16]  Olivier J. H'enaff,et al.  Perceiver IO: A General Architecture for Structured Inputs & Outputs , 2021, ICLR.

[17]  Junnan Li,et al.  Align before Fuse: Vision and Language Representation Learning with Momentum Distillation , 2021, NeurIPS.

[18]  Jianmin Bao,et al.  Uformer: A General U-Shaped Transformer for Image Restoration , 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Andrew Zisserman,et al.  Perceiver: General Perception with Iterative Attention , 2021, ICML.

[20]  Ilya Sutskever,et al.  Learning Transferable Visual Models From Natural Language Supervision , 2021, ICML.

[21]  Ling Shao,et al.  Multi-Stage Progressive Image Restoration , 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Angel Domingo Sappa,et al.  MPRNet: Multi-Path Residual Network for Lightweight Image Super Resolution , 2020, 2021 IEEE Winter Conference on Applications of Computer Vision (WACV).

[23]  S. Gelly,et al.  An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale , 2020, ICLR.

[24]  Radu Timofte,et al.  NH-HAZE: An Image Dehazing Benchmark with Non-Homogeneous Hazy and Haze-Free Images , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[25]  Syed Waqas Zamir,et al.  Learning Enriched Features for Real Image Restoration and Enhancement , 2020, ECCV.

[26]  Stella X. Yu,et al.  Open Compound Domain Adaptation , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Xiaoou Tang,et al.  Path-Restore: Learning Network Path Selection for Image Restoration , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Chen Wei,et al.  Deep Retinex Decomposition for Low-Light Enhancement , 2018, BMVC.

[29]  Stephen Lin,et al.  A High-Quality Denoising Dataset for Smartphone Cameras , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[30]  Vishal M. Patel,et al.  Density-Aware Single Image De-raining Using a Multi-stream Dense Network , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[31]  Frank Hutter,et al.  Decoupled Weight Decay Regularization , 2017, ICLR.

[32]  Tae Hyun Kim,et al.  Deep Multi-scale Convolutional Neural Network for Dynamic Scene Deblurring , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Shuicheng Yan,et al.  Deep Joint Rain Detection and Removal from a Single Image , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Kilian Q. Weinberger,et al.  Deep Networks with Stochastic Depth , 2016, ECCV.

[35]  Li Fei-Fei,et al.  Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.

[36]  Sylvain Paris,et al.  Learning photographic global tonal adjustment with a database of input / output image pairs , 2011, CVPR 2011.

[37]  Michel Barlaud,et al.  Two deterministic half-quadratic regularization algorithms for computed imaging , 1994, Proceedings of 1st International Conference on Image Processing.

[38]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.