Explicit Attention Modeling for Pedestrian Attribute Recognition

Recent studies on pedestrian attribute recognition have achieved significant improvements by using complex networks and attention mechanisms. However, most of these studies learn the attention map only implicitly, through the class activation map. In this paper, we propose an explicit attention modeling approach for pedestrian attribute recognition. We construct a mask branch that learns the attention maps with a lightweight feature pyramid network. The features inside each attribute-specific mask are then averaged to obtain the scores for attribute recognition. Additionally, we introduce spatial and semantic distillation to improve the consistency of the attention masks and the attribute scores. Our experiments demonstrate that the proposed explicit attention modeling achieves state-of-the-art performance on the PA100K, PETA, and PAR datasets while adding a negligible number of parameters.
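
The mask-averaging step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `masked_average_pool` and `attribute_scores`, the tensor shapes, the soft masks in [0, 1], and the per-attribute linear classifier are all assumptions made for illustration, since the abstract specifies only that features inside each mask are averaged to produce attribute scores.

```python
import numpy as np


def masked_average_pool(features, masks, eps=1e-6):
    """Average the features that fall inside each attribute mask.

    features: array of shape (C, H, W), a shared feature map (assumed layout).
    masks:    array of shape (A, H, W), one soft mask in [0, 1] per attribute.
    Returns an array of shape (A, C): one pooled feature vector per attribute.
    """
    # Sum of features weighted by each mask, for every attribute and channel.
    summed = np.einsum("ahw,chw->ac", masks, features)
    # Total mask weight per attribute; eps guards against an empty mask.
    area = masks.sum(axis=(1, 2))[:, None]
    return summed / (area + eps)


def attribute_scores(pooled, weights, bias):
    """Hypothetical per-attribute linear classifier followed by a sigmoid.

    pooled: (A, C), weights: (A, C), bias: (A,). Returns (A,) scores in (0, 1).
    """
    logits = (pooled * weights).sum(axis=1) + bias
    return 1.0 / (1.0 + np.exp(-logits))


# Toy example: 2 channels, a 2x2 spatial grid, 2 attributes.
features = np.array([[[1.0, 2.0], [3.0, 4.0]],
                     [[3.0, 3.0], [3.0, 3.0]]])
masks = np.array([[[1.0, 0.0], [0.0, 1.0]],   # attribute 0 attends to the diagonal
                  [[1.0, 1.0], [1.0, 1.0]]])  # attribute 1 attends everywhere
pooled = masked_average_pool(features, masks)
scores = attribute_scores(pooled, np.zeros((2, 2)), np.zeros(2))
```

Dividing by the mask area rather than by the full spatial size keeps the pooled feature independent of how large the attended region is, which is one plausible reading of "averaged inside the mask".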
