Improving triplet-wise training of convolutional neural network for vehicle re-identification

Vehicle re-identification (re-id) plays an important role in the automatic analysis of the drastically increasing urban surveillance videos. Similar to the other image retrieval problems, vehicle re-id suffers from the difficulties caused by various poses of vehicles, diversified illuminations, and complicated environments. Triplet-wise training of convolutional neural network (CNN) has been studied to address these challenges, where the CNN is adopted to automate the feature extraction from images, and the training adopts triplets of (query, positive example, negative example) to capture the relative similarity between them to learn representative features. The traditional triplet-wise training is weakly constrained and thus fails to achieve satisfactory results. We propose to improve the triplet-wise training at two aspects: first, a stronger constraint namely classification-oriented loss is augmented with the original triplet loss; second, a new triplet sampling method based on pairwise images is designed. Our experimental results demonstrate the effectiveness of the proposed methods that achieve superior performance than the state-of-the-arts on two vehicle re-id datasets, which are derived from real-world urban surveillance videos.

[1]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[2]  Xiaogang Wang,et al.  DeepReID: Deep Filter Pairing Neural Network for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[4]  Xiaogang Wang,et al.  Unsupervised Salience Learning for Person Re-identification , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Wu Liu,et al.  Large-scale vehicle re-identification in urban surveillance videos , 2016, 2016 IEEE International Conference on Multimedia and Expo (ICME).

[6]  Xiaoou Tang,et al.  A large-scale car dataset for fine-grained categorization and verification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Tao Mei,et al.  A Deep Learning-Based Approach to Progressive Vehicle Re-identification for Urban Surveillance , 2016, ECCV.

[8]  Alessandro Perina,et al.  Person re-identification by symmetry-driven accumulation of local features , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[9]  Dong Liu,et al.  Multi-Scale Triplet CNN for Person Re-Identification , 2016, ACM Multimedia.

[10]  P. Varaiya,et al.  Practical Scheme for Arterial Travel Time Estimation Based on Vehicle Reidentification Using Wireless Sensors , 2009 .

[11]  Xiaogang Wang,et al.  Learning Mid-level Filters for Person Re-identification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Michael Jones,et al.  An improved deep learning architecture for person re-identification , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Andrew Zisserman,et al.  Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.

[14]  Guodong Wang,et al.  Effects of hot rolled microstructure after twin-roll casting on microstructure, texture and magnetic properties of low silicon non-oriented electrical steel , 2016 .

[15]  Daoqin Tong,et al.  Vehicle Re-Identification With Dynamic Time Windows for Vehicle Passage Time Estimation , 2011, IEEE Transactions on Intelligent Transportation Systems.

[16]  Xiaogang Wang,et al.  Locally Aligned Feature Transforms across Views , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Hai Tao,et al.  Evaluating Appearance Models for Recognition, Reacquisition, and Tracking , 2007 .

[18]  Tiejun Huang,et al.  Deep Relative Distance Learning: Tell the Difference between Similar Vehicles , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).