Object Detection Under Challenging Lighting Conditions Using High Dynamic Range Imagery

Most Convolutional Neural Network (CNN) based object detectors to date have been optimized for accuracy and/or detection performance on datasets that typically consist of well-exposed Standard Dynamic Range (SDR) images with 8 bits per pixel per channel. A major open challenge in this area is detecting objects accurately under extreme or difficult lighting conditions, where detectors trained on SDR images fail. In this paper, we address this issue for the first time by introducing High Dynamic Range (HDR) imaging to object detection. HDR imagery can capture and process ≈13 orders of magnitude of scene dynamic range, similar to the human eye. HDR-trained models are therefore able to extract more salient features under extreme lighting conditions, leading to more accurate detections. However, introducing HDR also presents multiple new challenges, such as the complete absence of resources and previous literature on such an approach. Here, we introduce a methodology to generate a large-scale annotated HDR dataset from any existing SDR dataset and validate the quality of the generated dataset via a robust evaluation technique. We also discuss the challenges of training and validating HDR-trained models using existing detectors. Finally, we provide a methodology to create an out-of-distribution (OOD) HDR dataset to test and compare the performance of HDR- and SDR-trained detectors under difficult lighting conditions. Results suggest that, using the proposed methodology, HDR-trained models achieve 10–12% higher accuracy than SDR-trained models on a real-world OOD dataset consisting of high-contrast images under extreme lighting conditions.
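To make the dynamic-range gap mentioned above concrete, the following is a minimal sketch, not the paper's method, that measures dynamic range in orders of magnitude as log10(max / min) of linear luminance. It assumes SDR pixels are sRGB-encoded (the abstract does not state the encoding), and the helper names `srgb_to_linear` and `orders_of_magnitude` as well as the example HDR luminance values are hypothetical.

```python
import math


def srgb_to_linear(code: int) -> float:
    """Decode one 8-bit sRGB code value (0-255) to linear light in [0, 1].

    Assumption: SDR images use the standard sRGB transfer function.
    """
    c = code / 255.0
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def orders_of_magnitude(luminances) -> float:
    """Dynamic range as log10(max / min) over strictly positive values."""
    positive = [v for v in luminances if v > 0.0]
    if not positive:
        raise ValueError("need at least one positive luminance value")
    return math.log10(max(positive) / min(positive))


# Range between the darkest non-zero and the brightest 8-bit sRGB code values.
sdr_range = orders_of_magnitude([srgb_to_linear(1), srgb_to_linear(255)])

# Hypothetical linear scene luminances chosen to span 13 orders of magnitude.
hdr_range = orders_of_magnitude([1e-3, 1.0, 1e10])

print(f"SDR: {sdr_range:.2f} orders, HDR example: {hdr_range:.2f} orders")
```

Under these assumptions an 8-bit sRGB channel covers roughly 3.5 orders of magnitude between its darkest non-zero and brightest code values, which illustrates why detail in very dark or very bright regions of a high-contrast scene is lost before an SDR-trained detector ever sees it.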
