Vehicle Re-identification: an Efficient Baseline Using Triplet Embedding

In this paper we tackle the problem of vehicle re-identification in a camera network utilizing triplet embeddings. Re-identification is the problem of matching appearances of objects across different cameras. With the proliferation of surveillance cameras enabling smart and safer cities, there is an ever-increasing need to re-identify vehicles across cameras. Typical challenges arising in smart city scenarios include variations of viewpoints, illumination and self occlusions. Most successful approaches for re-identification involve (deep) learning an embedding space such that the vehicles of same identities are projected closer to one another, compared to the vehicles representing different identities. Popular loss functions for learning an embedding (space) include contrastive or triplet loss. In this paper we provide an extensive evaluation of these losses applied to vehicle re-identification and demonstrate that using the best practices for learning embeddings outperform most of the previous approaches proposed in the vehicle re-identification literature. Compared to most existing state-of-the-art approaches, our approach is simpler and more straightforward for training utilizing only identity-level annotations, along with one of the smallest published embedding dimensions for efficient inference. Furthermore in this work we introduce a formal evaluation of a triplet sampling variant (batch sample) into the re-identification literature.

[1]  Yann LeCun,et al.  Signature Verification Using A "Siamese" Time Delay Neural Network , 1993, Int. J. Pattern Recognit. Artif. Intell..

[2]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[3]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[4]  Yann LeCun,et al.  Learning a similarity metric discriminatively, with application to face verification , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[5]  Fatih Murat Porikli,et al.  Region Covariance: A Fast Descriptor for Detection and Classification , 2006, ECCV.

[6]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  Alessandro Perina,et al.  Person re-identification by symmetry-driven accumulation of local features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[8]  Bingpeng Ma,et al.  BiCov: a novel image representation for person re-identification and face verification , 2012, BMVC.

[9]  Suh-Yin Lee,et al.  Car model recognition by utilizing symmetric property to overcome severe pose variation , 2012, Machine Vision and Applications.

[10]  Bingpeng Ma,et al.  Local Descriptors Encoded by Fisher Vectors for Person Re-identification , 2012, ECCV Workshops.

[11]  Alex Graves,et al.  Recurrent Models of Visual Attention , 2014, NIPS.

[12]  Larry S. Davis,et al.  Jointly Optimizing 3D Model Fitting and Fine-Grained Classification , 2014, ECCV.

[13]  Huadong Ma,et al.  Vehicle Retrieval and Trajectory Inference in Urban Traffic Surveillance Scene , 2014, ICDSC.

[14]  Guillaume Charpiat,et al.  Multiple Object Tracking by Efficient Graph Partitioning , 2014, ACCV.

[15]  Nir Ailon,et al.  Deep Metric Learning Using Triplet Network , 2014, SIMBAD.

[16]  Xiaoou Tang,et al.  A large-scale car dataset for fine-grained categorization and verification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Qi Wang,et al.  Exploiting effects of parts in fine-grained categorization of vehicles , 2015, 2015 IEEE International Conference on Image Processing (ICIP).

[18]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[20]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[21]  Shengcai Liao,et al.  Person re-identification by Local Maximal Occurrence representation and metric learning , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Tao Mei,et al.  A Deep Learning-Based Approach to Progressive Vehicle Re-identification for Urban Surveillance , 2016, ECCV.

[23]  Qingming Huang,et al.  Relay Backpropagation for Effective Learning of Deep Convolutional Neural Networks , 2015, ECCV.

[24]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Adam Herout,et al.  Vehicle Re-identification for Automatic Video Traffic Surveillance , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[26]  Tiejun Huang,et al.  Deep Relative Distance Learning: Tell the Difference between Similar Vehicles , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Martín Abadi,et al.  TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.

[28]  Xiaogang Wang,et al.  Learning Deep Feature Representations with Domain Guided Dropout for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Wu Liu,et al.  Large-scale vehicle re-identification in urban surveillance videos , 2016, 2016 IEEE International Conference on Multimedia and Expo (ICME).

[30]  K. R. Ramakrishnan,et al.  Deep automatic license plate recognition system , 2016, ICVGIP '16.

[31]  Shaogang Gong,et al.  Learning a Discriminative Null Space for Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[32]  Manohar Paluri,et al.  Metric Learning with Adaptive Density Discrimination , 2015, ICLR.

[33]  Adam Herout,et al.  BoxCars: 3D Boxes as CNN Input for Improved Fine-Grained Vehicle Recognition , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Wei Zeng,et al.  Exploiting Multi-grain Ranking Constraints for Precisely Searching Visually-similar Vehicles , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[35]  S. Gong,et al.  KANACI, ZHU, GONG: VEHICLE RE-ID BY FINE-GRAINED CROSS-LEVEL DEEP LEARNING1 Vehicle Re-Identification by Fine-Grained Cross-Level Deep Learning , 2017 .

[36]  Bo Chen,et al.  MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.

[37]  Vittorio Murino,et al.  Exploiting Feature Correlations by Brownian Statistics for People Detection and Recognition , 2017, IEEE Transactions on Systems, Man, and Cybernetics: Systems.

[38]  Bernt Schiele,et al.  Multiple People Tracking by Lifted Multicut and Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Xiaogang Wang,et al.  Orientation Invariant Feature Embedding and Spatial Temporal Regularization for Vehicle Re-identification , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[40]  Xiaogang Wang,et al.  Learning Deep Neural Networks for Vehicle Re-ID with Visual-spatio-Temporal Path Proposals , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[41]  Pavel Zemcík,et al.  Holistic recognition of low quality license plates by CNN using track annotated data , 2017, 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[42]  Huibing Wang,et al.  Deep CNNs With Spatially Weighted Pooling for Fine-Grained Car Recognition , 2017, IEEE Transactions on Intelligent Transportation Systems.

[43]  David C. Anastasiu,et al.  The NVIDIA AI City Challenge , 2017, 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI).

[44]  Lucas Beyer,et al.  In Defense of the Triplet Loss for Person Re-Identification , 2017, ArXiv.

[45]  Alexander J. Smola,et al.  Sampling Matters in Deep Embedding Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[46]  Jiri Matas,et al.  Working hard to know your neighbor's margins: Local descriptor learning loss , 2017, NIPS.

[47]  Ling-Yu Duan,et al.  Group-Sensitive Triplet Embedding for Vehicle Reidentification , 2018, IEEE Transactions on Multimedia.

[48]  Ling Shao,et al.  Viewpoint-Aware Attentive Multi-view Inference for Vehicle Re-identification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[49]  Carlo Tomasi,et al.  Features for Multi-target Multi-camera Tracking and Re-identification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[50]  Alex Bewley,et al.  Deep Cosine Metric Learning for Person Re-identification , 2018, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[51]  Hanqing Lu,et al.  Learning Coarse-to-Fine Structured Feature Embedding for Vehicle Re-Identification , 2018, AAAI.

[52]  Ling Shao,et al.  Vehicle Re-Identification by Adversarial Bi-Directional LSTM Network , 2018, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[53]  Jenq-Neng Hwang,et al.  CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).