Refinement of Ontology-Constrained Human Pose Classification

In this paper, we propose an image classification method that recognizes poses in idol photographs. The method takes unannotated idol photos as input and classifies them by pose, based on the spatial layout of the idol in each photo. It has two phases. The first estimates the spatial layout of ten body parts (head, torso, and the upper and lower arms and legs) using Eichner's Stickman Pose Estimation. The second classifies the poses using Bayesian Network classifiers. To improve classification accuracy, we introduce the Pose Guide Ontology (PGO), which contains background knowledge such as semantic hierarchies and constraints on the positional relationships between body parts. The estimated locations of the body parts are amended using PGO. We also propose iterative procedures for further refining PGO. Finally, we evaluate our method on a dataset of 400 images in 8 poses; the F-measure of the classification is 15% higher than that obtained without amendment.
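The amendment step can be illustrated with a minimal Python sketch. This is not the paper's PGO or its amendment procedure. It assumes a single hypothetical positional constraint (the head lies above the torso in image coordinates) and a hypothetical repair rule (re-place the head stick directly above the torso's upper endpoint). The names `Segment`, `head_above_torso`, and `amend_head` are introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A body-part stick given by two endpoints (image coordinates, y grows downward)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def center(self):
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)

    @property
    def length(self):
        return ((self.x2 - self.x1) ** 2 + (self.y2 - self.y1) ** 2) ** 0.5


def head_above_torso(parts):
    """Hypothetical constraint: the head center is above the torso center."""
    return parts["head"].center[1] < parts["torso"].center[1]


def amend_head(parts):
    """Return parts unchanged if the constraint holds; otherwise return a copy
    with the head re-placed vertically above the torso's upper endpoint
    (an assumed repair rule, not the one used in the paper)."""
    if head_above_torso(parts):
        return parts
    torso = parts["torso"]
    # The upper endpoint of the torso is the one with the smaller y value.
    top_x, top_y = min((torso.x1, torso.y1), (torso.x2, torso.y2), key=lambda p: p[1])
    head_len = parts["head"].length
    amended = dict(parts)
    amended["head"] = Segment(top_x, top_y - head_len, top_x, top_y)
    return amended


# Example: the estimator placed the head below the torso, violating the constraint.
estimated = {
    "head": Segment(50, 200, 50, 230),
    "torso": Segment(50, 100, 50, 180),
}
amended = amend_head(estimated)
```

In a full pipeline, the amended part locations would then be passed to the classifier in place of the raw estimates.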
