Person Retrieval in Surveillance Video using Height, Color and Gender

A person is commonly described by attributes such as height, build, clothing color, clothing type, and gender. Such attributes are known as soft biometrics. They bridge the semantic gap between a human description and person retrieval in surveillance video. The paper proposes a deep learning-based linear filtering approach for person retrieval using height, clothing color, and gender. The proposed approach uses Mask R-CNN for pixel-wise person segmentation, which removes background clutter and provides a precise boundary around the person. Color and gender models are fine-tuned using AlexNet, and the algorithm is tested on the SoftBioSearch dataset. It achieves good accuracy for person retrieval from a semantic query in challenging conditions.
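The linear filtering idea can be illustrated with a minimal sketch: each attribute acts as one filter stage, and only the candidates that pass a stage are handed to the next. This is not the paper's implementation. The `Detection` record, the `linear_filter` and `retrieve` helpers, the stage order (height, then color, then gender), and the exact-match tests are all assumptions made for illustration. The attribute values are taken as already computed, standing in for the Mask R-CNN segmentation and the fine-tuned AlexNet classifiers.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Detection:
    """One segmented person. All fields are hypothetical stand-ins for model outputs."""
    person_id: int
    height_cm: float   # assumed: height estimate derived from the segmented person
    torso_color: str   # assumed: label predicted by a color classifier
    gender: str        # assumed: label predicted by a gender classifier


def linear_filter(
    detections: List[Detection],
    filters: List[Callable[[Detection], bool]],
) -> List[Detection]:
    """Apply the filters one after another, keeping only survivors of each stage."""
    remaining = list(detections)
    for keep in filters:
        remaining = [d for d in remaining if keep(d)]
        if not remaining:  # nothing left to narrow down
            break
    return remaining


def retrieve(
    detections: List[Detection],
    height_range: Tuple[float, float],
    color: str,
    gender: str,
) -> List[Detection]:
    """Match a semantic query (height range, clothing color, gender) against detections."""
    low, high = height_range
    return linear_filter(
        detections,
        [
            lambda d: low <= d.height_cm <= high,  # stage 1: height
            lambda d: d.torso_color == color,      # stage 2: clothing color
            lambda d: d.gender == gender,          # stage 3: gender
        ],
    )
```

For example, `retrieve(detections, (170, 190), "red", "male")` returns only the detections that satisfy all three parts of the query. Because each stage only sees the survivors of the previous one, the later stages run on progressively fewer candidates.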
