Hyperparameter optimization in black-box image processing using differentiable proxies

Nearly every commodity imaging system we directly interact with, or indirectly rely on, leverages power-efficient, application-adjustable black-box hardware image signal processing units (ISPs), running either in dedicated hardware blocks or as proprietary software modules on programmable hardware. The configuration parameters of these black-box ISPs often have complex interactions with the output image, and must be adjusted prior to deployment according to application-specific quality and performance metrics. Today, this search is commonly performed manually by "golden eye" experts or algorithm developers leveraging domain expertise. We present a fully automatic system to optimize the parameters of black-box hardware and software image processing pipelines according to an arbitrary (i.e., application-specific) metric. We leverage a differentiable mapping between the configuration space and evaluation metrics, parameterized by a convolutional neural network that we train in an end-to-end fashion with imaging hardware in the loop. Unlike prior art, our differentiable proxies allow for high-dimensional parameter search with stochastic first-order optimizers, without explicitly modeling any lower-level image processing transformations. As such, we can efficiently optimize black-box image processing pipelines for a variety of imaging applications, reducing application-specific configuration times from months to hours. Our optimization method is fully automatic, even with black-box hardware in the loop. We validate our method on experimental data for real-time display applications, object detection, and extreme low-light imaging. The proposed approach outperforms manual search qualitatively and quantitatively for all domain-specific applications tested. When applied to traditional denoisers, we demonstrate that, just by changing hyperparameters, traditional algorithms can outperform recent deep learning methods by a substantial margin on recent benchmarks.
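The proxy idea described above can be illustrated with a small, hedged sketch. It is not the authors' implementation: a quadratic least-squares surrogate stands in for the convolutional proxy network, plain gradient descent stands in for a stochastic first-order optimizer, and `black_box_isp_loss`, `fit_proxy`, and `optimize_on_proxy` are hypothetical names invented for this example. The sketch only shows the two-stage pattern: first fit a differentiable model to samples of a black box, then follow the model's gradients to search the parameter space.

```python
import numpy as np


def black_box_isp_loss(theta):
    """Hypothetical stand-in for a black-box ISP plus a quality metric.

    Only function values are observable; the optimizer never sees gradients.
    """
    target = np.array([0.7, 0.2, 0.5])  # assumed "good" setting, illustration only
    return float(np.sum((theta - target) ** 2) + 0.05 * theta[0] ** 3)


def quad_features(thetas):
    """Build [1, theta_i, theta_i * theta_j (i <= j)] features for each row."""
    n, d = thetas.shape
    rows, cols = np.triu_indices(d)
    cross = thetas[:, rows] * thetas[:, cols]
    return np.hstack([np.ones((n, 1)), thetas, cross])


def fit_proxy(thetas, losses):
    """Fit the proxy p(theta) = c + b.theta + theta^T A theta by least squares."""
    d = thetas.shape[1]
    w, *_ = np.linalg.lstsq(quad_features(thetas), losses, rcond=None)
    c, b = w[0], w[1:1 + d]
    A = np.zeros((d, d))
    A[np.triu_indices(d)] = w[1 + d:]
    A = (A + A.T) / 2.0  # symmetrizing leaves theta^T A theta unchanged
    return c, b, A


def optimize_on_proxy(b, A, theta0, lr=0.1, steps=200):
    """Gradient descent on the proxy; parameters are kept inside [0, 1]."""
    theta = theta0.copy()
    for _ in range(steps):
        grad = b + 2.0 * A @ theta  # analytic gradient of the quadratic proxy
        theta = np.clip(theta - lr * grad, 0.0, 1.0)
    return theta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stage 1: query the black box at random settings and fit the proxy.
    thetas = rng.uniform(0.0, 1.0, size=(64, 3))
    losses = np.array([black_box_isp_loss(t) for t in thetas])
    c, b, A = fit_proxy(thetas, losses)
    # Stage 2: search the parameter space using proxy gradients only.
    theta0 = np.zeros(3)
    theta_opt = optimize_on_proxy(b, A, theta0)
    print("initial loss:", black_box_isp_loss(theta0))
    print("optimized loss:", black_box_isp_loss(theta_opt))
```

In this toy setting the black box is queried only to collect training samples and to check the final result; every optimization step uses the surrogate's gradient, which is what makes first-order search possible when the real pipeline exposes no derivatives.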
