Pedestrian Attribute Recognition with Part-based CNN and Combined Feature Representations

In video surveillance, pedestrian attributes such as gender, clothing or hair type are useful cues for identifying people. The main challenge in pedestrian attribute recognition is the large variation in the visual appearance and location of attributes caused by different poses and camera views. In this paper, we propose a neural network that combines high-level learnt Convolutional Neural Network (CNN) features with low-level handcrafted features to address the problem of highly varying viewpoints. We first extract robust low-level Local Maximal Occurrence (LOMO) features and learn a body part-specific CNN to model attribute patterns related to different body parts. For datasets with little training data, we propose a new learning strategy in which the CNN is pre-trained in a triplet structure on a person re-identification task and then fine-tuned on attribute recognition. Finally, we fuse the two feature representations to recognise pedestrian attributes. Our approach achieves state-of-the-art results on three public pedestrian attribute datasets.
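The two mechanisms named above, triplet pre-training and feature fusion, can be illustrated with a minimal sketch. It assumes a standard hinge-style triplet loss on squared Euclidean distances and fusion by concatenating L2-normalised vectors; the abstract does not specify the paper's exact loss or fusion layers, so the function names, the margin value and the concatenation step are all hypothetical choices made for illustration.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss (assumed form, not taken from the paper).

    The loss is zero once the anchor is closer to the positive (same person)
    than to the negative (different person) by at least `margin`.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def l2_normalise(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def fuse(cnn_features, lomo_features):
    """Concatenate normalised CNN and LOMO features (assumed fusion scheme)."""
    return l2_normalise(cnn_features) + l2_normalise(lomo_features)


if __name__ == "__main__":
    # Toy 2-D embeddings: the positive is near the anchor, the negative is far.
    print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
    # Toy fusion of a 2-D "CNN" vector with a 2-D "LOMO" vector.
    print(fuse([3.0, 4.0], [0.0, 2.0]))
```

Normalising each representation before concatenation keeps either feature type from dominating purely because of its scale, which is one common reason to normalise before fusing learnt and handcrafted features.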
