Fast End-to-End Trainable Guided Filter

Image processing and pixel-wise dense prediction have been advanced by harnessing the capabilities of deep learning. One central issue of deep learning is its limited capacity to handle joint upsampling. We present a deep learning building block for joint upsampling, namely the guided filtering layer. This layer aims to efficiently generate a high-resolution output given the corresponding low-resolution output and a high-resolution guidance map. The proposed layer is composed of a guided filter, which is reformulated as a fully differentiable block. To this end, we show that a guided filter can be expressed as a group of spatially varying linear transformation matrices. This layer can be integrated with convolutional neural networks (CNNs) and jointly optimized through end-to-end training. To take further advantage of end-to-end training, we plug in a trainable transformation function that generates task-specific guidance maps. By integrating CNNs with the proposed layer, we form deep guided filtering networks. The proposed networks are evaluated on five advanced image processing tasks. Experiments on the MIT-Adobe FiveK dataset demonstrate that the proposed approach runs 10-100× faster and achieves state-of-the-art performance. We also show that the proposed guided filtering layer helps improve the performance of multiple pixel-wise dense prediction tasks. The code is available at https://github.com/wuhuikai/DeepGuidedFilter.
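The idea that a guided filter acts as a spatially varying linear transformation of the guidance map can be illustrated with a minimal NumPy sketch. This is not the released implementation: the function names (`box_mean`, `guided_filter_coeffs`, `joint_upsample`), the single-channel inputs, the nearest-neighbour upsampling of the coefficients, and the use of a subsampled guidance map at low resolution are all assumptions made for illustration. The sketch estimates per-pixel coefficients `a` and `b` at low resolution, upsamples them, and applies `a * guide + b` at high resolution.

```python
import numpy as np

def box_mean(x, r):
    """Mean over a (2r+1)x(2r+1) window, using edge-replicated padding."""
    k = 2 * r + 1
    p = np.pad(x, r, mode="edge")
    # Integral image with a leading row and column of zeros.
    s = np.zeros((p.shape[0] + 1, p.shape[1] + 1))
    s[1:, 1:] = p.cumsum(0).cumsum(1)
    h, w = x.shape
    total = s[k:k + h, k:k + w] - s[:h, k:k + w] - s[k:k + h, :w] + s[:h, :w]
    return total / (k * k)

def guided_filter_coeffs(guide, target, r=1, eps=1e-4):
    """Per-pixel linear coefficients (a, b) such that target ~ a * guide + b."""
    mean_g = box_mean(guide, r)
    mean_t = box_mean(target, r)
    cov_gt = box_mean(guide * target, r) - mean_g * mean_t
    var_g = box_mean(guide * guide, r) - mean_g ** 2
    a = cov_gt / (var_g + eps)      # local slope, regularised by eps
    b = mean_t - a * mean_g         # local offset
    # Average the coefficients of all windows covering each pixel.
    return box_mean(a, r), box_mean(b, r)

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling (an assumption chosen for simplicity)."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def joint_upsample(guide_lr, target_lr, guide_hr, factor, r=1, eps=1e-4):
    """Fit (a, b) at low resolution, then apply them to the high-res guide."""
    a_lr, b_lr = guided_filter_coeffs(guide_lr, target_lr, r, eps)
    a_hr = upsample_nearest(a_lr, factor)
    b_hr = upsample_nearest(b_lr, factor)
    return a_hr * guide_hr + b_hr

# Usage: a 16x16 guide, a 8x8 low-resolution output, upsampled by a factor of 2.
rng = np.random.default_rng(0)
guide_hr = rng.random((16, 16))
guide_lr = guide_hr[::2, ::2]           # hypothetical low-res guide (subsampled)
target_lr = 2.0 * guide_lr + 0.5        # stand-in for a low-res network output
output_hr = joint_upsample(guide_lr, target_lr, guide_hr, factor=2)
```

Every operation here (box means, element-wise products, division, upsampling) is differentiable with respect to its inputs, which is what would allow such a block to sit inside a network trained end to end; a framework with automatic differentiation would be needed to actually train through it.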
