Remote detection of idling cars using infrared imaging and deep networks

Idling vehicles waste energy and pollute the environment through exhaust emissions. In some countries, idling a vehicle for longer than a predefined duration is prohibited, so automatic detection of idling vehicles is desirable for law enforcement. We propose the first automatic system to detect idling cars, using infrared (IR) imaging and deep networks. The system relies on the difference between the spatio-temporal heat signatures of idling cars and of stopped cars, and monitors car temperature with a long-wavelength IR camera. We formulate idling car detection as spatio-temporal event detection in IR image sequences and employ deep networks for the spatio-temporal modeling. The pipeline has three stages. First, we detect the cars in each IR image using a convolutional neural network that is pre-trained on regular RGB images and fine-tuned on IR images for higher accuracy. Then, we track the detected cars over time to identify those that are parked. Finally, we feed the 3D spatio-temporal IR image volume of each parked car to convolutional and recurrent networks, which classify the car as idling or not. We collected the first IR image sequence dataset for idling car detection, carried out an extensive empirical evaluation of temporal and spatio-temporal modeling approaches with various convolutional and recurrent architectures, and present promising experimental results on this dataset.
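The tracking stage and the construction of the per-car image volume can be illustrated with a minimal sketch. The abstract does not say how cars are tracked or how "parked" is decided, so everything here is an assumption made for illustration: the greedy overlap-based linking, the function names `find_parked_cars` and `crop_volume`, and the parameters `iou_thresh` and `min_frames` are all hypothetical. Detections are taken as given, in the form of (x1, y1, x2, y2) pixel boxes per frame.

```python
from typing import Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def find_parked_cars(frames: List[List[Box]],
                     iou_thresh: float = 0.8,
                     min_frames: int = 3) -> List[Dict]:
    """Hypothetical parked-car test: a detection extends a track when it
    overlaps the track's first (anchor) box by at least iou_thresh.
    Tracks seen in at least min_frames frames are reported as parked."""
    tracks: List[Dict] = []
    for t, boxes in enumerate(frames):
        for box in boxes:
            best, best_iou = None, iou_thresh
            for track in tracks:
                if track["frames"][-1] == t:
                    continue  # this track already took a box in frame t
                score = iou(track["anchor"], box)
                if score >= best_iou:
                    best, best_iou = track, score
            if best is None:
                tracks.append({"anchor": box, "frames": [t]})
            else:
                best["frames"].append(t)
    return [track for track in tracks if len(track["frames"]) >= min_frames]


def crop_volume(ir_frames: Sequence[Sequence[Sequence[float]]],
                box: Box,
                frame_ids: Sequence[int]) -> List[List[List[float]]]:
    """Stack the same box region from each listed frame into a T x H x W volume."""
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    return [[list(row[x1:x2]) for row in ir_frames[t][y1:y2]] for t in frame_ids]
```

A moving car never overlaps its anchor box strongly enough, so it only produces short tracks that are filtered out, while a stationary car accumulates frames even if the detector misses it occasionally. The volume returned by `crop_volume` for a parked car is the kind of input the abstract describes passing to the convolutional and recurrent classifiers.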
