Viewpoint-Aware Attentive Multi-view Inference for Vehicle Re-identification

Vehicle re-identification (re-ID) has huge potential to contribute to intelligent video surveillance. However, it faces two challenges: different vehicle identities with similar appearance show little inter-instance discrepancy, while a single vehicle usually shows large intra-instance differences under viewpoint and illumination variations. Previous methods address vehicle re-ID by simply using visual features from the originally captured views, and usually exploit spatial-temporal information about the vehicles to refine the results. In this paper, we propose a Viewpoint-aware Attentive Multi-view Inference (VAMI) model that requires only visual information to solve the multi-view vehicle re-ID problem. Given vehicle images of arbitrary viewpoints, VAMI extracts a single-view feature for each input image and aims to transform these features into a global multi-view feature representation, so that pairwise distance metric learning can be better optimized in such a viewpoint-invariant feature space. VAMI adopts a viewpoint-aware attention model to select core regions at different viewpoints and implements effective multi-view feature inference through an adversarial training architecture. Extensive experiments validate the effectiveness of each proposed component and show that our approach achieves consistent improvements over state-of-the-art vehicle re-ID methods on two public datasets: VeRi and VehicleID.
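
To make the phrase "pairwise distance metric learning" concrete, the following is a minimal sketch of a generic contrastive loss over a pair of feature vectors. The abstract does not state which loss VAMI uses, so the loss form, the function names `euclidean` and `contrastive_loss`, and the `margin` parameter are all assumptions chosen for illustration, not the authors' implementation.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(f1, f2, same_identity, margin=1.0):
    """Generic contrastive loss for one feature pair (illustrative assumption).

    Pairs of the same vehicle are penalized by their squared distance;
    pairs of different vehicles are penalized only when they fall closer
    than `margin` in the feature space.
    """
    d = euclidean(f1, f2)
    if same_identity:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


# Hypothetical multi-view features for two images (values are made up).
front_view = [0.0, 0.0]
rear_view = [3.0, 4.0]
print(contrastive_loss(front_view, rear_view, same_identity=True))
print(contrastive_loss(front_view, rear_view, same_identity=False))
```

In a viewpoint-invariant feature space, images of the same vehicle taken from different viewpoints should already lie close together, which is why a pairwise loss of this kind would be easier to optimize there than on raw single-view features.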
