Paper information - BoxCars: Improving Fine-Grained Recognition of Vehicles Using 3-D Bounding Boxes in Traffic Surveillance

In this paper, we focus on fine-grained recognition of vehicles, mainly in traffic surveillance applications. We propose an approach that is orthogonal to recent advancements in fine-grained recognition (automatic part discovery and bilinear pooling). In addition, in contrast to other methods focused on fine-grained recognition of vehicles, we do not limit ourselves to a frontal/rear viewpoint, but allow the vehicles to be seen from any viewpoint. Our approach is based on 3-D bounding boxes built around the vehicles. The bounding box can be constructed automatically from traffic surveillance data. For scenarios where precise construction is not possible, we propose a method for estimating the 3-D bounding box. The 3-D bounding box is used to normalize the image viewpoint by “unpacking” the image into a plane. We also propose to randomly alter the color of the image and to add a rectangle filled with random noise at a random position in the image during the training of convolutional neural networks (CNNs). We have collected a large fine-grained vehicle data set, BoxCars116k, with 116k images of vehicles from various viewpoints taken by numerous surveillance cameras. Our experiments show that the proposed method significantly improves CNN classification accuracy (accuracy increases by up to 12 percentage points and the error is reduced by up to 50% compared with CNNs without the proposed modifications). We also show that our method outperforms the state-of-the-art methods for fine-grained recognition.
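The two training-time image alterations mentioned in the abstract (a random color change and a noise-filled rectangle at a random position) can be illustrated with a minimal NumPy sketch. The abstract does not state how strongly the color is altered or how large the rectangle is, so the function names, the parameters `max_shift` and `max_fraction`, their default values, and the choice of a per-channel additive offset are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def alter_color(image, rng, max_shift=30):
    """Add a random per-channel offset to an HxWx3 uint8 image.

    max_shift is a hypothetical parameter; the source gives no value.
    """
    # One offset per color channel, drawn uniformly from [-max_shift, max_shift].
    shift = rng.integers(-max_shift, max_shift + 1, size=3)
    shifted = image.astype(np.int16) + shift
    return np.clip(shifted, 0, 255).astype(np.uint8)


def add_noise_rectangle(image, rng, max_fraction=0.3):
    """Overwrite a randomly placed rectangle with uniform random noise.

    max_fraction (assumed) caps the rectangle at that share of each side.
    """
    height, width, channels = image.shape
    # Rectangle size: at least 1 pixel, at most max_fraction of each side.
    rect_h = int(rng.integers(1, max(2, int(height * max_fraction) + 1)))
    rect_w = int(rng.integers(1, max(2, int(width * max_fraction) + 1)))
    # Top-left corner chosen so the rectangle stays inside the image.
    top = int(rng.integers(0, height - rect_h + 1))
    left = int(rng.integers(0, width - rect_w + 1))
    out = image.copy()
    out[top:top + rect_h, left:left + rect_w] = rng.integers(
        0, 256, size=(rect_h, rect_w, channels), dtype=np.uint8
    )
    return out


def augment(image, rng):
    """Apply both alterations to one training image."""
    return add_noise_rectangle(alter_color(image, rng), rng)


# Usage: augment a uniform gray placeholder image with a seeded generator.
rng = np.random.default_rng(0)
image = np.full((64, 48, 3), 128, dtype=np.uint8)
augmented = augment(image, rng)
```

In a training pipeline, such a function would typically be applied to each image when a batch is assembled, so that the network sees a different alteration of the same vehicle in every epoch.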
