Spectrally Consistent UNet for High Fidelity Image Transformations

Convolutional Neural Networks (CNNs) are the current de facto approach for many imaging tasks due to their high learning capacity and their architectural qualities. The ubiquitous UNet architecture provides an efficient, multi-scale solution that combines local and global information. Despite the success of UNet architectures, their upsampling layers can cause checkerboard artefacts or blurring. In this work, a method is presented for assessing the structural biases of UNets and the effects these have on the outputs, characterising their impact in the Fourier domain. A new upsampling module is then proposed, based on a novel generalisation of the Guided Image Filter, that provides spectrally consistent outputs when used in a UNet architecture, forming the Guided UNet (GUNet). The GUNet architecture is evaluated quantitatively and qualitatively in an example application of dynamic range expansion for high dynamic range imaging. The proposed method provides higher fidelity results, while executing faster and consuming less memory than other dedicated architectures that avoid upsampling.
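As a rough illustration of why upsampling can leave a trace in the Fourier domain, the sketch below upsamples a 1D cosine by zero insertion and inspects its spectrum. This is a generic signal-processing example, not the analysis method or the upsampling module of this work; the function names, the signal length, and the linear-interpolation kernel are assumptions chosen for illustration.

```python
import numpy as np

def zero_insert_upsample(x, factor=2):
    """Upsample a 1D signal by placing zeros between its samples."""
    y = np.zeros(len(x) * factor, dtype=x.dtype)
    y[::factor] = x
    return y

def magnitude_spectrum(x):
    """Magnitude of the one-sided discrete Fourier transform."""
    return np.abs(np.fft.rfft(x))

# Illustrative input (assumed): a cosine with 4 cycles over 64 samples.
n = 64
t = np.arange(n)
low_res = np.cos(2 * np.pi * 4 * t / n)

# Zero insertion doubles the length and replicates the spectrum,
# so the single tone reappears as a high-frequency image.
up = zero_insert_upsample(low_res, factor=2)
mag_up = magnitude_spectrum(up)

# A circular linear-interpolation kernel [0.5, 1, 0.5] attenuates the
# replica but does not remove it entirely.
smooth = up + 0.5 * np.roll(up, 1) + 0.5 * np.roll(up, -1)
mag_smooth = magnitude_spectrum(smooth)

strongest_bins = sorted(np.argsort(mag_up)[-2:].tolist())
print("strongest bins after zero insertion:", strongest_bins)
print("replica/base ratio after zero insertion:", mag_up[60] / mag_up[4])
print("replica/base ratio after interpolation:", mag_smooth[60] / mag_smooth[4])
```

The replica that appears at the high-frequency bin is the 1D analogue of a checkerboard pattern, while an interpolation filter that suppresses it also lowers high-frequency content in general, which is one way to think about the blurring mentioned above.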
