PixelRL: Fully Convolutional Network With Reinforcement Learning for Image Processing

This article tackles a new problem setting: reinforcement learning with pixel-wise rewards (pixelRL) for image processing. Since the introduction of the deep Q-network, deep reinforcement learning (RL) has achieved great success. However, its applications to image processing are still limited. Therefore, we extend deep RL to pixelRL for various image processing applications. In pixelRL, each pixel has an agent, and the agent changes the pixel value by taking an action. We also propose an effective learning method for pixelRL that significantly improves performance by considering not only the future states of each agent's own pixel but also those of its neighboring pixels. The proposed method can be applied to image processing tasks that require pixel-wise manipulations, where deep RL has never been applied. In addition, it is possible to visualize which operation is applied to each pixel at each iteration, which helps us understand why and how such an operation is chosen. We also believe that our technology can enhance the explainability and interpretability of deep neural networks. Moreover, because the operations executed at each pixel are visualized, we can change or modify them if necessary. We apply the proposed method to a variety of image processing tasks: image denoising, image restoration, local color enhancement, and saliency-driven image editing. Our experimental results demonstrate that the proposed method achieves performance comparable to or better than that of state-of-the-art methods based on supervised learning. The source code is available at https://github.com/rfuruta/pixelRL.
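The pixel-wise setting described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the three-action set, the squared-error reward, and the 3x3 neighbor averaging are assumptions chosen only to show one agent per pixel, one reward per pixel, and one simple way of mixing in information from neighboring pixels. All function names are hypothetical.

```python
import numpy as np

# Hypothetical action set for illustration only (not taken from the source):
# index 0 = do nothing, 1 = brighten by one gray level, 2 = darken by one gray level.
ACTION_DELTAS = np.array([0.0, 1.0 / 255.0, -1.0 / 255.0])


def apply_actions(image, actions):
    """Apply one action per pixel. `actions` holds an action index for every pixel."""
    return np.clip(image + ACTION_DELTAS[actions], 0.0, 1.0)


def pixelwise_reward(prev, curr, target):
    """Assumed reward: how much each pixel's squared error to the target decreased."""
    return (prev - target) ** 2 - (curr - target) ** 2


def neighbor_average(values):
    """Average each pixel's value with its 3x3 neighborhood (edge-padded).

    A stand-in for taking neighboring pixels into account; the averaging
    weights here are fixed and uniform, which is an assumption of this sketch.
    """
    padded = np.pad(values, 1, mode="edge")
    h, w = values.shape
    total = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            total += padded[dy:dy + h, dx:dx + w]
    return total / 9.0
```

In this sketch, a policy network would produce the `actions` map from the current image, and the per-pixel rewards would drive its training; both parts are omitted here.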
