Class Activation Mapping-Based Car Saliency Region and Detection for In-Vehicle Surveillance

This paper addresses the problem of simultaneously detecting and localizing cars using an in-vehicle camera. Although current state-of-the-art research has tackled the detection problem, there is still room for improvement, in particular in optimizing the detection process. In this research, we use the activation map obtained from the last layer of the convolutional network to localize the saliency region of the car and to detect car objects, in combination with the trained weights of a deep architecture. Our study also finds a benefit in replacing the last layer of the network, a fully-connected layer, with Global Average Pooling (GAP), which alleviates the overfitting problem. Experimental results with various optimization methods show their impact on performance: the proposed GAP-based deep learning with ADAM optimization gives the best results, up to 99.83%. This implies that our method could be used in real-world scenarios.
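As an illustration of the idea described above, the following minimal NumPy sketch shows how a class activation map can be formed from the last convolutional feature maps when the classifier sits on top of a GAP layer, and how a saliency region can be read off the map. It is not the implementation used in the paper: the function names, the array shapes, and the 0.5 threshold are all assumptions made for this example.

```python
import numpy as np


def global_average_pool(feature_maps):
    """Average each channel over its spatial grid: (C, H, W) -> (C,)."""
    return feature_maps.mean(axis=(1, 2))


def class_activation_map(feature_maps, class_weights):
    """Weighted sum of the channels for one class: (C, H, W), (C,) -> (H, W).

    class_weights are the weights connecting the GAP output to the
    score of one class (hypothetical name, assumed layout).
    """
    return np.tensordot(class_weights, feature_maps, axes=([0], [0]))


def saliency_box(cam, threshold=0.5):
    """Bounding box (row_min, col_min, row_max, col_max) of the region
    whose min-max normalized activation is at least `threshold`.

    The threshold value is an assumption for illustration only.
    Returns None when no location passes the threshold.
    """
    span = cam.max() - cam.min()
    if span == 0:
        return None  # flat map: no salient region can be separated
    normalized = (cam - cam.min()) / span
    rows, cols = np.where(normalized >= threshold)
    if rows.size == 0:
        return None
    return int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max())


if __name__ == "__main__":
    # Toy input: 2 channels on a 4x4 grid, channel 0 active in the center.
    features = np.zeros((2, 4, 4))
    features[0, 1:3, 1:3] = 1.0
    weights_for_car = np.array([1.0, 0.0])  # hypothetical "car" class weights

    cam = class_activation_map(features, weights_for_car)
    print(saliency_box(cam))
```

Because the class score in this setup is the dot product of the GAP output with the class weights, it equals the spatial mean of the class activation map. The same trained weights therefore serve both the detection score and the localization map.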
