Supplementary Material of Blind Image Decomposition

1.1 Training details, running time, and model size

BIDeN. We train BIDeN on a Tesla P100-PCIE-16GB GPU with GPU driver version 440.64.00 and CUDA version 10.2. We initialize weights using Xavier initialization [6]. For Task I (Mixed image decomposition across multiple domains), training BIDeN (2) through BIDeN (8) takes approximately 37, 50, 61, 71, 82, 91, and 101 hours, respectively. For Task II.A (Real-scenario deraining in driving), training BIDeN takes approximately 96 hours. This figure includes the additional mask reconstruction and source prediction tasks that BIDeN is required to perform; removing these tasks from training reduces the cost to about 45 GPU hours.

Double-DIP. We follow the default training setting of Double-DIP [5] and use the official PyTorch implementation (link). We optimize on a single image for 8000 iterations on a Tesla P100-PCIE-16GB GPU with GPU driver version 415.27 and CUDA version 10.0. The runtime for a single input image is approximately 20 minutes.

DAD. We follow the default training setting (200 epochs, batch size 2, image crop size 256) of DAD [19]. Experiments are based on the official PyTorch implementation (link). We train DAD on a Tesla P100-PCIE-16GB GPU with GPU driver version 440.64.00 and CUDA version 10.2. The runtime of DAD is 13 hours.

MPRNet. We follow the default training setting (250 epochs, batch size 16, image crop size 256) of MPRNet [16]. For a fair comparison, we apply the same data augmentation operations used in BIDeN to MPRNet. We use the official PyTorch implementation (link) of MPRNet. We train MPRNet on four Tesla P100-PCIE-16GB GPUs with GPU driver version 415.27 and CUDA version 10.0. The runtime of MPRNet is 20 hours and the model size is 41.8 MB.

Restormer. We follow the training setting used in link. As with MPRNet, we apply the same data augmentation operations used in BIDeN to Restormer.
Our results are based on a PyTorch implementation (link) of Restormer [15]. We train Restormer on a GeForce RTX 3090 GPU with GPU driver version 510.47 and CUDA version 11.6. The runtime of Restormer is approximately 8 hours and the model size is 25.3 MB.
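
To make the Task I training times reported for BIDeN easier to compare, the following minimal Python sketch computes how many extra hours each additional source adds. The hour values are the approximate figures quoted above; the names `task1_hours` and `marginal_hours` are introduced here for illustration only and do not come from the BIDeN code.

```python
# Approximate Task I training times (hours) quoted in the text,
# keyed by the maximum number of sources N in BIDeN (N).
task1_hours = {2: 37, 3: 50, 4: 61, 5: 71, 6: 82, 7: 91, 8: 101}


def marginal_hours(hours_by_sources):
    """Return the extra training hours incurred by each additional source.

    Hypothetical helper for this illustration: maps each source count N
    (except the smallest) to hours(N) - hours(previous count).
    """
    counts = sorted(hours_by_sources)
    return {
        n: hours_by_sources[n] - hours_by_sources[prev]
        for prev, n in zip(counts, counts[1:])
    }


increments = marginal_hours(task1_hours)
mean_increment = sum(increments.values()) / len(increments)

for n, extra in increments.items():
    print(f"BIDeN ({n}): +{extra} h over BIDeN ({n - 1})")
print(f"mean increase per added source: {mean_increment:.1f} h")
```

On these figures, each added source costs between 9 and 13 additional hours of training, so the reported training time grows roughly linearly with the number of sources.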

[1] Clayton D. Scott, et al. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2] Syed Waqas Zamir, et al. Restormer: Efficient Transformer for High-Resolution Image Restoration, 2021, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[3] Yang Liu, et al. Auto-Exposure Fusion for Single-Image Shadow Removal, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4] Ling Shao, et al. Multi-Stage Progressive Image Restoration, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[5] Yang Liu, et al. WDNet: Watermark-Decomposition Network for Visible Watermark Removal, 2020, 2021 IEEE Winter Conference on Applications of Computer Vision (WACV).

[6] Zhenwei Shi, et al. Deep Adversarial Decomposition: A Unified Framework for Separating Superimposed Images, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7] Cheng Shi, et al. Towards Ghost-free Shadow Removal via Dual Hierarchical Aggregation Network and Shadow Matting GAN, 2019, AAAI.

[8] Paul Newman, et al. I Can See Clearly Now: Image Restoration via De-Raining, 2019, 2019 International Conference on Robotics and Automation (ICRA).

[9] Michal Irani, et al. "Double-DIP": Unsupervised Image Decomposition via Coupled Deep-Image-Priors, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10] Ren Ng, et al. Single Image Reflection Separation with Perceptual Losses, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[11] Le Hui, et al. Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[12] Jenq-Neng Hwang, et al. DesnowNet: Context-Aware Deep Network for Snow Removal, 2017, IEEE Transactions on Image Processing.

[13] Rynson W. H. Lau, et al. DeshadowNet: A Multi-context Embedding Deep Network for Shadow Removal, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14] Shuicheng Yan, et al. Deep Joint Rain Detection and Removal from a Single Image, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15] Li Fei-Fei, et al. Perceptual Losses for Real-Time Style Transfer and Super-Resolution, 2016, ECCV.

[16] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.

[17] Dinesh Manocha, et al. Appearance-preserving simplification, 1998, SIGGRAPH.

[18] James F. Blinn, et al. A Generalization of Algebraic Surface Drawing, 1982, TOGS.

[19] Harshad Rai, et al. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, 2018.

[20] Mohinder Malhotra. Single Image Haze Removal Using Dark Channel Prior, 2016.