A large-scale car dataset for fine-grained categorization and verification

This paper highlights vision tasks centered on cars, which the vision community has largely neglected in comparison with other object categories. We show that many interesting car-related problems and applications remain underexplored. To facilitate future car-related research, we present our ongoing effort to collect a large-scale dataset, “CompCars”, that covers not only different car views but also internal and external car parts and rich attributes. Importantly, the dataset is cross-modality by construction, containing a surveillance-nature set and a web-nature set. We further demonstrate several important applications that exploit the dataset, namely car model classification, car model verification, and attribute prediction. We also discuss challenges specific to car-related problems and other potential applications that are worth further investigation. The latest dataset can be downloaded at http://mmlab.ie.cuhk.edu.hk/datasets/comp_cars/index.html.
