Improving human body part detection using deep learning and motion consistency

Body part segmentation and detection in videos is useful for many computer vision tasks, such as action recognition and video search. Conventional methods mainly focus on body part detection under the assumption that the human body is upright. Recently, a body part detection framework was proposed that also covers non-upright postures. This method consists of two parts: an initial segmentation, and the computation of a body part likelihood score for each segment. In this paper, we propose two improvements to this approach. First, we propose a novel motion-based body part segmentation that uses kinematic features to identify segments that undergo similar motion in the video, based on a consistency or error measure. Second, we replace the Extreme Learning Machine classifier used in the original work with deep learning in order to investigate its performance. For accurate detection, deep learning requires a large amount of training data, and it has so far been used only on high-resolution images. Here, we apply deep learning to body part detection in low-resolution cases. We conduct experiments to study and analyse the effect of the proposed improvements.
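As a rough illustration of the kind of kinematic features that could drive a motion-based segmentation, the sketch below computes the divergence and vorticity of a dense optical flow field and compares two candidate segments. The function names, the choice of these two features, and the mean-difference error measure are assumptions made for illustration only; they are not taken from the method described here.

```python
import numpy as np

def kinematic_features(u, v):
    """Return (divergence, vorticity) of a dense 2D flow field.

    u, v: 2D arrays with the horizontal and vertical flow components,
    indexed as [row (y), column (x)].
    """
    # np.gradient returns derivatives along axis 0 (y) and then axis 1 (x).
    du_dy, du_dx = np.gradient(u)
    dv_dy, dv_dx = np.gradient(v)
    divergence = du_dx + dv_dy   # local expansion or contraction
    vorticity = dv_dx - du_dy    # local rotation
    return divergence, vorticity

def motion_inconsistency(feature, mask_a, mask_b):
    """Hypothetical error measure: absolute difference between the mean
    feature values of two segments given as boolean masks."""
    return float(abs(feature[mask_a].mean() - feature[mask_b].mean()))

# Synthetic example: a purely rotational flow field (u = -y, v = x).
ys, xs = np.mgrid[0:5, 0:5].astype(float)
div, vort = kinematic_features(-ys, xs)

# Two segments taken from the same rigid rotation should look consistent.
left = xs < 2
right = xs >= 2
error = motion_inconsistency(vort, left, right)
```

Under these assumptions, a small error suggests that two segments move together and could be merged, while a large error suggests that they belong to different body parts.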

[1]  Peter V. Gehler,et al.  Poselet Conditioned Pictorial Structures , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Dong Liang,et al.  A 3D object recognition and pose estimation system using deep learning method , 2014, 2014 4th IEEE International Conference on Information Science and Technology.

[3]  Bernt Schiele,et al.  Pictorial structures revisited: People detection and articulated pose estimation , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Hao Jiang,et al.  Human pose estimation using consistent max-covering , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[5]  Mircea Nicolescu,et al.  Human Body Parts Tracking Using Torso Tracking: Applications to Activity Recognition , 2012, 2012 Ninth International Conference on Information Technology - New Generations.

[6]  Mubarak Shah,et al.  Human Action Recognition in Videos Using Kinematic Features and Multiple Instance Learning , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Michael J. Black,et al.  From Pictorial Structures to deformable structures , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Jian Dong,et al.  Deep Human Parsing with Active Template Regression , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Ramakant Nevatia,et al.  Tracking of Multiple, Partially Occluded Humans based on Static Body Part Detection , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[10]  Etienne Corvée,et al.  Body Parts Detection for People Tracking Using Trees of Histogram of Oriented Gradient Descriptors , 2010, 2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance.

[11]  Katerina Fragkiadaki,et al.  Pose from Flow and Flow from Pose , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Michael J. Black,et al.  HumanEva: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion , 2010, International Journal of Computer Vision.

[13]  Zhidong Deng,et al.  A robust pedestrian detection approach based on shapelet feature and Haar detector ensembles , 2012 .

[14]  Jitendra Malik,et al.  Poselets: Body part detectors trained using 3D human pose annotations , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[15]  Hao Jiang,et al.  Human movement summarization and depiction from videos , 2013, 2013 IEEE International Conference on Multimedia and Expo (ICME).

[16]  Christian Szegedy,et al.  DeepPose: Human Pose Estimation via Deep Neural Networks , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Mark Everingham,et al.  Clustered Pose and Nonlinear Appearance Models for Human Pose Estimation , 2010, BMVC.

[18]  Bernt Schiele,et al.  2D Human Pose Estimation: New Benchmark and State of the Art Analysis , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Xiaoqing Ding,et al.  A novel hierarchical framework for human head-shoulder detection , 2011, 2011 4th International Congress on Image and Signal Processing.

[20]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[21]  Andrew Zisserman,et al.  Progressive search space reduction for human pose estimation , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[22]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[23]  Wei-Yun Yau,et al.  Human body part detection using likelihood score computations , 2014, 2014 IEEE Symposium on Computational Intelligence in Biometrics and Identity Management (CIBIM).

[24]  Daniel P. Huttenlocher,et al.  Pictorial Structures for Object Recognition , 2004, International Journal of Computer Vision.

[25]  Thomas Brox,et al.  High Accuracy Optical Flow Estimation Based on a Theory for Warping , 2004, ECCV.

[26]  Xiaolong Wang,et al.  Discriminative Deep Belief Networks for image classification , 2010, 2010 IEEE International Conference on Image Processing.