Multi-attribute learning for pedestrian attribute recognition in surveillance scenarios

In real video surveillance scenarios, visual pedestrian attributes, such as gender, backpack, clothes types, are very important for pedestrian retrieval and person reidentification. Existing methods for attributes recognition have two drawbacks: (a) handcrafted features (e.g. color histograms, local binary patterns) cannot cope well with the difficulty of real video surveillance scenarios; (b) the relationship among pedestrian attributes is ignored. To address the two drawbacks, we propose two deep learning based models to recognize pedestrian attributes. On the one hand, each attribute is treated as an independent component and the deep learning based single attribute recognition model (DeepSAR) is proposed to recognize each attribute one by one. On the other hand, to exploit the relationship among attributes, the deep learning framework which recognizes multiple attributes jointly (DeepMAR) is proposed. In the DeepMAR, one attribute can contribute to the representation of other attributes. For example, the gender of woman can contribute to the representation oflong hair and wearing skirt. Experiments on recent popular pedestrian attribute datasets illustrate that our proposed models achieve the state-of-the-art results.

[1]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[3]  Ali Farhadi,et al.  Describing objects by their attributes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Shree K. Nayar,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence Describable Visual Attributes for Face Verification and Image Search , 2022 .

[5]  Shaogang Gong,et al.  Person Re-identification by Attributes , 2012, BMVC.

[6]  Xiaoou Tang,et al.  Pedestrian Attribute Recognition At Far Distance , 2014, ACM Multimedia.

[7]  Ming Yang,et al.  DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Xiaogang Wang,et al.  Deeply learned face representations are sparse, selective, and robust , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Shaogang Gong,et al.  Person Re-Identification , 2014 .

[10]  Rogério Schmidt Feris,et al.  Attribute-based people search in surveillance environments , 2009, 2009 Workshop on Applications of Computer Vision (WACV).

[11]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[12]  Larry S. Davis,et al.  Image ranking and retrieval based on multi-attribute queries , 2011, CVPR 2011.

[13]  Xiaoou Tang,et al.  Learning to Recognize Pedestrian Attribute , 2015, ArXiv.

[14]  Shengcai Liao,et al.  Pedestrian Attribute Classification in Surveillance: Database and Evaluation , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[15]  Kun Duan,et al.  Discovering localized attributes for fine-grained recognition , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Tao Xiang,et al.  Video Analytics for Business Intelligence , 2012, Studies in Computational Intelligence.

[17]  Trevor Darrell,et al.  PANDA: Pose Aligned Networks for Deep Attribute Modeling , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Shuicheng Yan,et al.  Clothing Attributes Assisted Person Reidentification , 2015, IEEE Transactions on Circuits and Systems for Video Technology.

[19]  Xiao Liu,et al.  Attribute-restricted latent topic model for person re-identification , 2012, Pattern Recognit..

[20]  Andrew Zisserman,et al.  Learning Visual Attributes , 2007, NIPS.

[21]  Hai Tao,et al.  Viewpoint Invariant Pedestrian Recognition with an Ensemble of Localized Features , 2008, ECCV.