Vehicle recognition using multi-task cascaded network

Vehicle attribute recognition comprises two main tasks: vehicle localization and vehicle category recognition. We propose a multi-task cascaded model, MC-CNN, which integrates an improved Faster R-CNN with an improved CNN. The first stage uses the improved Faster R-CNN network (IFR-CNN) to localize objects, and the second stage uses the improved CNN network (ICNN) to recognize them. In the IFR-CNN sub-network, a max pooling operation and a deconvolution operation are added to the shallow layers of Faster R-CNN, so that IFR-CNN can extract features at different levels and enrich the location information available from the shallow layers. In the ICNN sub-network, we improve the extraction of high-level semantic information in the middle and deep layers of the CNN. Experimental results show that the proposed MC-CNN achieves higher attribute recognition accuracy on the BIT-Vehicle and SYIT-Vehicle datasets than standalone Faster R-CNN and CNN models.
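
The two-stage cascade described above can be illustrated with a minimal Python sketch. It shows only the control flow (locate first, then classify each located region) and is not the MC-CNN implementation: `toy_locate` and `toy_classify` are hypothetical stand-ins for IFR-CNN and ICNN, and the box format, the area threshold, and the labels are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive; assumed format
Image = List[List[int]]          # grayscale image as rows of pixel values


def crop(image: Image, box: Box) -> Image:
    """Cut the region covered by `box` out of `image`."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def cascade(
    image: Image,
    locate: Callable[[Image], List[Box]],
    classify: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 proposes boxes; stage 2 labels the crop inside each box."""
    return [(box, classify(crop(image, box))) for box in locate(image)]


def toy_locate(image: Image) -> List[Box]:
    """Hypothetical stand-in for the localization stage:
    returns one box around all non-zero pixels, or no box at all."""
    coords = [(x, y) for y, row in enumerate(image)
              for x, value in enumerate(row) if value]
    if not coords:
        return []
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def toy_classify(patch: Image) -> str:
    """Hypothetical stand-in for the recognition stage:
    labels a patch by its area (arbitrary threshold, illustrative labels)."""
    height = len(patch)
    width = len(patch[0]) if patch else 0
    return "truck" if width * height >= 6 else "sedan"


image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
detections = cascade(image, toy_locate, toy_classify)
print(detections)
```

Passing the two stages as callables keeps them independent, which mirrors the cascaded design: either stage could be swapped for a trained network without changing the surrounding control flow.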
