Paper Information - Bridging the Gap Between Computational Photography and Visual Recognition

What is the current state of the art for image restoration and enhancement applied to degraded images acquired under less-than-ideal circumstances? Can applying such algorithms as a pre-processing step improve image interpretability for manual analysis or for automatic visual recognition that classifies scene content? While there have been important advances in computational photography to restore or enhance the visual quality of an image, the capabilities of such techniques have not always translated in a useful way to visual recognition tasks. To address this, we introduce the UG^2 dataset as a large-scale benchmark composed of video imagery captured under challenging conditions, along with two enhancement tasks designed to test algorithmic impact on visual quality and automatic object recognition. Furthermore, we propose a set of metrics to evaluate the joint improvement of such tasks as well as individual algorithmic advances, including a novel psychophysics-based evaluation regime for human assessment and a realistic set of quantitative measures for object recognition performance. We introduce six new algorithms for image restoration or enhancement, which were created as part of the IARPA-sponsored UG^2 Challenge workshop held at CVPR 2018.
