Pose Recognition for Dense Vehicles under Complex Street Scenario

Locating vehicles in the surrounding environment and distinguishing their poses are crucial tasks in autonomous driving systems. An intelligent transportation system expects a vision-based object detector not only to detect and classify vehicles but also to recognize their poses, so that their future direction of movement can be estimated. However, most current object detection algorithms only distinguish between classes of vehicles, and few studies address classifying different poses within the same vehicle class. In addition, currently available datasets provide no annotations specific to the poses of vehicles within a class. To enable vehicle pose recognition, we build a vehicle pose dataset based on the Cityscapes dataset. Furthermore, we improve the network model of the YOLO v2 detector to support simultaneous vehicle detection and pose classification, with an improvement of 10.71% in mean average precision (mAP) compared to the default network which runs in real time at 29 FPS.
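One way to picture simultaneous detection and pose classification in a YOLO v2 style detector is to treat every (vehicle class, pose) pair as its own label. The sketch below illustrates only that idea and is not the authors' network modification, which the abstract does not describe. The class and pose names and both helper functions are hypothetical. The filter formula is the standard YOLO v2 one: each anchor predicts 4 box coordinates, 1 objectness score, and one score per class.

```python
from itertools import product


def joint_labels(vehicle_classes, poses):
    """Build one label per (vehicle class, pose) pair, e.g. 'car_front'."""
    return [f"{c}_{p}" for c, p in product(vehicle_classes, poses)]


def yolo_v2_output_filters(num_classes, num_anchors=5):
    """Filters in the final YOLO v2 convolution.

    Each anchor predicts 4 box coordinates + 1 objectness score
    + num_classes class scores.
    """
    return num_anchors * (5 + num_classes)


# Hypothetical label sets; the abstract does not list the actual ones.
vehicle_classes = ["car", "truck", "bus"]
poses = ["front", "rear", "side"]

labels = joint_labels(vehicle_classes, poses)
filters = yolo_v2_output_filters(len(labels))

print(len(labels), "joint labels ->", filters, "output filters")
```

Under this assumed encoding, adding poses only widens the final layer, so a single forward pass yields both the vehicle class and its pose.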
