Low-light pedestrian detection from RGB images using multi-modal knowledge distillation

Although deep-learning-based pedestrian detectors have improved steadily in recent years, their performance tends to degrade under challenging illumination conditions. This hinders their deployment in Advanced Driver Assistance Systems (ADAS), where consistent performance across varying environmental lighting is desired. Inspired by the concept of dark knowledge, this paper proposes a novel guided deep network that distills knowledge from a multi-modal pedestrian detector. The proposed network learns to extract both RGB and thermal-like features from RGB images alone, removing the need for costly automotive-grade thermal cameras. Compelling detection performance in severe lighting conditions is demonstrated on KAIST, a publicly available night-time pedestrian dataset. We achieve a miss rate 12% lower than recent state-of-the-art methods.
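The general idea of guiding an RGB-only network with a multi-modal teacher can be illustrated with a feature-mimicking loss. The sketch below is not the paper's formulation, which the abstract does not specify: the function names, the mean-squared-error penalty, and the `weight` parameter are all assumptions chosen for illustration. It treats feature maps as flat lists of floats and adds a penalty that pulls the student's thermal-like features toward the teacher's thermal features.

```python
def feature_mimic_loss(student_feat, teacher_feat):
    """Mean squared error between two flattened feature maps.

    Hypothetical helper: the student's "thermal-like" features (computed
    from RGB only) are pulled toward the teacher's real thermal features.
    """
    if len(student_feat) != len(teacher_feat) or not student_feat:
        raise ValueError("feature maps must be non-empty and the same length")
    squared_errors = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(squared_errors) / len(squared_errors)


def guided_loss(detection_loss, student_feat, teacher_feat, weight=1.0):
    """Detection loss plus a weighted mimicking term (weight is an assumed knob)."""
    return detection_loss + weight * feature_mimic_loss(student_feat, teacher_feat)
```

In such a setup the teacher would be used only during training, so the trained student needs nothing but the RGB image at test time, which is consistent with the abstract's claim of detecting from RGB images alone.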
