Joint Demosaicking and Denoising in the Wild: The Case of Training Under Ground Truth Uncertainty

Image demosaicking and denoising are two fundamental steps in digital camera pipelines, aiming to reconstruct clean color images from noisy luminance readings. In this paper, we propose and study Wild-JDD, a novel learning framework for joint demosaicking and denoising in the wild. In contrast to previous works, which generally assume the ground truth of the training data is a perfect reflection of reality, we consider here the more common imperfect case of ground truth uncertainty in the wild. We first illustrate its manifestation as various kinds of artifacts, including zipper effect, color moire and residual noise. Then we formulate a two-stage data degradation process to capture such ground truth uncertainty, where a conjugate prior distribution is imposed upon a base distribution. After that, we derive an evidence lower bound (ELBO) loss to train a neural network that approximates the parameters of the conjugate prior distribution conditioned on the degraded input. Finally, to further enhance the performance on out-of-distribution input, we design a simple but effective fine-tuning strategy by taking the input as a weakly informative prior. Taking ground truth uncertainty into account, Wild-JDD enjoys good interpretability during optimization. Extensive experiments validate that it outperforms state-of-the-art schemes on joint demosaicking and denoising tasks on both synthetic and realistic raw datasets.

Introduction

Modern digital cameras use a single sensor overlaid with a color filter array (CFA) to capture an image. This means that only one color channel's value is recorded at each pixel location. Let N be the number of pixels in an image; the raw data acquisition process can then be simply modeled as

x = Az + n,    (1)

where x ∈ R^N is a noisy raw data vector of luminance readings, A ∈ R^{N×3N} is a mosaicking operator, z ∈ R^{3N} is the unknown clean image with three color channels, and n ∈ R^N is a noise vector.
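The acquisition model in Eq. (1) can be made concrete with a small simulation. The sketch below is an illustrative assumption, not the paper's exact setup: it fixes an RGGB Bayer layout for A and i.i.d. Gaussian noise for n.

```python
import numpy as np

def bayer_mosaic(z, sigma=0.01, rng=None):
    """Simulate Eq. (1), x = Az + n, for an RGGB Bayer CFA.

    z: clean image of shape (H, W, 3) with even H and W.
    Returns x of shape (H, W): one retained color sample per pixel,
    plus i.i.d. Gaussian noise of standard deviation sigma.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    H, W, _ = z.shape
    x = np.empty((H, W), dtype=float)
    # The operator A keeps exactly one channel per pixel (RGGB tiling):
    x[0::2, 0::2] = z[0::2, 0::2, 0]  # red at even rows, even cols
    x[0::2, 1::2] = z[0::2, 1::2, 1]  # green at even rows, odd cols
    x[1::2, 0::2] = z[1::2, 0::2, 1]  # green at odd rows, even cols
    x[1::2, 1::2] = z[1::2, 1::2, 2]  # blue at odd rows, odd cols
    return x + sigma * rng.standard_normal((H, W))
```

Demosaicking then amounts to inverting this channel selection, and denoising to removing the added noise.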
Before the final "cooked" image is ready for users, the raw data undergoes a series of processing steps, known as the image processing pipeline. Among those, demosaicking and denoising (DM&DN) are two of the earliest and most crucial steps. Demosaicking aims to undo the mosaicking operation A by interpolating the missing two-thirds of each pixel's color channels, while denoising removes the inevitable noise n from the measurement x. Due to their modular property, much of the traditional literature treats them as independent tasks and executes them sequentially. This yields potentially suboptimal performance, and has inspired several works on jointly addressing the DM&DN tasks (Liu et al. 2020; Kokkinos and Lefkimmiatis 2019; Tan et al. 2017a). Among the joint DM&DN works, data-driven approaches (Liu et al. 2020; Tan et al. 2018; Kokkinos and Lefkimmiatis 2018) have been shown to be more effective than applying handcrafted priors and filters. These approaches usually require a collection of paired data: the mosaicked noisy images x and their demosaicked clean "ground truth" counterparts y. However, it is often costly and tedious to collect a large amount of high-quality real-life data. Furthermore, the collected y is not free of artifacts or noise. We illustrate this in Figure 1.

Figure 1: Imperfect ground truth examples (electronic zoom-in recommended): (a) Zipper effect: a ground truth image from the CBSD dataset (Arbeláez et al. 2011) suffering from an artificial jagged pattern around edges; (b) Color moire in an image from the ImageNet dataset (Russakovsky et al. 2015), an artifact that appears as false coloring due to interpolation error; (c) Noticeable residual noise in a collected "clean" image from the Renoir dataset (Anaya and Barbu 2018).

Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
arXiv:2101.04442v1 [cs.CV] 12 Jan 2021

For demosaicking, many approaches (Syu, Chen, and Chuang 2018; Tan et al. 2017b) take the output of a camera pipeline as y, possibly introducing artifacts like zipper effect or color moire in regions with rich textures and sharp edges. For denoising, the "clean" images are often collected by either setting a low ISO (Plotz and Roth 2017; Anaya and Barbu 2018) or averaging a set of repeated shots of the same scene (Abdelhamed, Lin, and Brown 2018), both of which still leave noticeable noise. Moreover, such a denoising data collection process usually assumes the captured objects to be perfectly still, or requires precise spatial alignment and intensity calibration among a burst of images. Potential failure cases introduce additional error into the collected dataset. Therefore, all these in-the-wild issues mean that the "ground truth" y deviates from the authentic z, limiting the performance of DM&DN models.

To account for the fact that the collected ground truth y is not a perfect reflection of z, we propose Wild-JDD, a novel joint demosaicking and denoising learning framework that enables training under ground truth uncertainty. In Wild-JDD, we first formulate a two-stage data degradation process, where a conjugate prior distribution is imposed upon a base Gaussian distribution. Then, we derive an ELBO loss from a variational perspective. In this way, the optimization process is aware of the target uncertainty, which prevents the trained neural network from overfitting to such random errors. Beyond that, when a testing image falls outside of the training distribution, we further enhance the performance by regarding the input as a weakly informative prior. Our main contributions are summarized as follows:

• We identify the ground truth uncertainty issues in existing DM&DN datasets, which manifest themselves as various artifacts in the wild, such as zipper effect, color moire and residual noise.
• We introduce a novel learning framework for joint demosaicking and denoising in the wild (Wild-JDD), where a two-stage data degradation process and an ELBO loss are formulated for optimization. We also propose a simple but effective fine-tuning strategy for out-of-distribution input.

• Instead of simply generating a demosaicked clean image, networks instantiated from our framework are capable of estimating all the parameters involved in data degradation and reconstruction, which provides better interpretability of the optimization process.

• We conduct extensive experiments on both synthetic and realistic datasets. Quantitative and qualitative comparisons show that Wild-JDD substantially outperforms state-of-the-art works.
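To make the conjugate-prior idea concrete, the sketch below computes the negative log-likelihood of an observation under the Student-t marginal of a Normal-Inverse-Gamma (NIG) prior, the standard conjugate prior for a Gaussian with unknown mean and variance. The function name and this particular parameterization are illustrative assumptions; the paper's actual ELBO additionally involves the two-stage degradation model and is not reproduced here.

```python
import math

def nig_nll(y, gamma, nu, alpha, beta):
    """Negative log-likelihood of observation y under the Student-t
    marginal of a Normal-Inverse-Gamma prior NIG(gamma, nu, alpha, beta),
    with nu, alpha, beta > 0. gamma plays the role of the predicted mean;
    nu, alpha, beta encode the confidence in that prediction.
    """
    omega = 2.0 * beta * (1.0 + nu)
    return (0.5 * math.log(math.pi / nu)
            - alpha * math.log(omega)
            + (alpha + 0.5) * math.log(nu * (y - gamma) ** 2 + omega)
            + math.lgamma(alpha)
            - math.lgamma(alpha + 0.5))
```

A network trained with such a loss outputs the four prior parameters per pixel instead of a single value, which is what gives the framework its interpretability: the spread parameters flag pixels whose "ground truth" is uncertain.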

[1] Gabriele Facciolo, et al. Joint Demosaicking and Denoising by Fine-Tuning of Bursts of Raw Images, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[2] Jaakko Lehtinen, et al. Noise2Noise: Learning Image Restoration without Clean Data, 2018, ICML.

[3] Yun Fu, et al. Residual Dense Network for Image Super-Resolution, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[4] Loïc Royer, et al. Noise2Self: Blind Denoising by Self-Supervision, 2019, ICML.

[5] Weidong Min, et al. No-reference/Blind Image Quality Assessment: A Survey, 2017.

[6] Florian Jug, et al. Noise2Void - Learning Denoising From Single Noisy Images, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[7] Kyoung Mu Lee, et al. Enhanced Deep Residual Networks for Single Image Super-Resolution, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[8] Wangmeng Zuo, et al. Color Image Demosaicking via Deep Residual Learning, 2017.

[9] Lu Fang, et al. Joint Denoising and Demosaicking of Noisy CFA Images Based on Inter-color Correlation, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[10] Thomas Pock, et al. Learning Joint Demosaicing and Denoising Based on Sequential Energy Minimization, 2016, 2016 IEEE International Conference on Computational Photography (ICCP).

[11] Stefan Roth, et al. Benchmarking Denoising Algorithms with Real Photographs, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12] Andrea Vedaldi, et al. Deep Image Prior, 2017, International Journal of Computer Vision.

[13] Baoxin Li, et al. Color Image Demosaicking Using Inter-channel Correlation and Nonlocal Self-similarity, 2015, Signal Process. Image Commun.

[14] Stamatios Lefkimmiatis, et al. Deep Image Demosaicking Using a Cascade of Convolutional Residual Denoising Networks, 2018, ECCV.

[15] Qin Xu, et al. Learning Raw Image Denoising With Bayer Pattern Unification and Bayer Preserving Augmentation, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[16] Narendra Ahuja, et al. Single Image Super-Resolution from Transformed Self-Exemplars, 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17] Tao Huang, et al. Lightweight Deep Residue Learning for Joint Color Image Demosaicking and Denoising, 2018, 2018 24th International Conference on Pattern Recognition (ICPR).

[18] Luc Van Gool, et al. NTIRE 2017 Challenge on Single Image Super-Resolution: Methods and Results, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[19] Jungho Yoon, et al. Joint Demosaicing and Denoising Based on a Variational Deep Image Prior Neural Network, 2020, Sensors.

[20] Andrew W. Fitzgibbon, et al. Joint Demosaicing and Denoising via Learned Nonparametric Random Fields, 2014, IEEE Transactions on Image Processing.

[21] Charless C. Fowlkes, et al. Contour Detection and Hierarchical Image Segmentation, 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22] Yu-Sheng Chen, et al. Learning Deep Convolutional Networks for Demosaicing, 2018, ArXiv.

[23] Yasuyuki Matsushita, et al. A Holistic Approach to Cross-Channel Image Noise Modeling and Its Application to Image Denoising, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24] Thomas Huang, et al. Adaptation Strategies for Applying AWGN-Based Denoiser to Realistic Noise, 2019, AAAI.

[25] Inbar Mosseri, et al. XGAN: Unsupervised Image-to-Image Translation for Many-to-Many Mappings, 2017, Domain Adaptation for Visual Understanding.

[26] David Zhang, et al. PCA-Based Spatially Adaptive Denoising of CFA Images for Single-Sensor Digital Cameras, 2009, IEEE Transactions on Image Processing.

[27] Lin Liu, et al. Joint Demosaicing and Denoising With Self Guidance, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[28] Lei Zhang, et al. Color Demosaicking by Local Directional Interpolation and Nonlocal Adaptive Thresholding, 2011, J. Electronic Imaging.

[29] Deyu Meng, et al. Variational Denoising Network: Toward Blind Noise Modeling and Removal, 2019, NeurIPS.

[30] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.

[31] Xiangchu Feng, et al. FOCNet: A Fractional Optimal Control Network for Image Denoising, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[32] Stephen Lin, et al. A High-Quality Denoising Dataset for Smartphone Cameras, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[33] Max Welling, et al. Auto-Encoding Variational Bayes, 2013, ICLR.

[34] Jean-Michel Morel, et al. A Review of an Old Dilemma: Demosaicking First, or Denoising First?, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[35] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.

[36] Adrian Barbu, et al. RENOIR - A Benchmark Dataset for Real Noise Reduction Evaluation, 2014, ArXiv.

[37] Yu Liu, et al. Joint Demosaicing and Denoising of Noisy Bayer Images with ADMM, 2017, 2017 IEEE International Conference on Image Processing (ICIP).

[38] Masatoshi Okutomi, et al. Pseudo Four-Channel Image Denoising for Noisy CFA Raw Data, 2015, 2015 IEEE International Conference on Image Processing (ICIP).

[39] Frédo Durand, et al. Deep Joint Demosaicking and Denoising, 2016, ACM Trans. Graph.

[40] Kari Pulli, et al. FlexISP, 2014, ACM Trans. Graph.

[41] Stamatios Lefkimmiatis, et al. Iterative Joint Image Demosaicking and Denoising Using a Residual Denoising Network, 2018, IEEE Transactions on Image Processing.