Visual attribute detection for pedestrian detection

Attributes are expected to narrow the semantic gap between low-level visual features and high-level semantic meanings. This advantage motivates us to explore pedestrian attributes, whose recognition has become a critical problem for advancing image understanding and improving the performance of pedestrian detection, retrieval, re-identification, etc. Based on the PETA dataset, we manually relabel two subsets, VIPeR and PRID, as our experimental dataset. Moreover, we propose an evaluation protocol for researchers to evaluate pedestrian attribute classification algorithms. In this paper, we use two baseline methods to demonstrate the performance of attributes in pedestrian detection. The first trains a Support Vector Machine (SVM) classifier directly on color and texture features, while the second trains an SVM classifier on Dense SIFT (DSIFT) descriptors encoded with Bag-of-Visual-Words (BoVW). Finally, we report and discuss the baseline performance on the dataset following the proposed evaluation protocol.
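To make the second baseline more concrete, the following is a minimal sketch of the BoVW encoding step only, assuming the local descriptors (e.g. DSIFT) and the visual-word codebook have already been computed. The function names, the toy 2-D descriptors, and the three-word codebook are hypothetical and chosen for illustration; they are not taken from the paper, and a real pipeline would typically use 128-D SIFT descriptors and a codebook learned by clustering.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the visual word (codebook entry) closest to the descriptor."""
    return min(
        range(len(codebook)),
        key=lambda k: squared_distance(descriptor, codebook[k]),
    )


def bovw_histogram(
    descriptors: Sequence[Vector], codebook: Sequence[Vector]
) -> List[float]:
    """Encode an image as an L1-normalized histogram of visual-word counts."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        # No descriptors were extracted: return an all-zero histogram.
        return [0.0] * len(codebook)
    return [count / total for count in counts]


# Toy data (assumption): 2-D descriptors and a 3-word codebook.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.1), (0.0, 0.8)]
histogram = bovw_histogram(descriptors, codebook)
print(histogram)
```

Under these assumptions, each image becomes a fixed-length vector regardless of how many descriptors it yields, and such vectors could then be passed to one binary SVM per attribute.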
