Posture recognition with a top-view camera

We describe a system that recognizes human postures under heavy self-occlusion. In particular, we address posture recognition in a robot-assisted living scenario, where the environment is equipped with a top-view camera for monitoring human activities. This setup is attractive because a top-view camera yields accurate localization and little inter-person occlusion, but it suffers from body parts being frequently self-occluded. Conventional posture recognition relies on accurate estimation of body-part positions, which turns out to be unstable in the top view due to occlusion and foreshortening. In our approach, we instead learn a posture descriptor for each posture category. The descriptor encodes how well the person in the image can be 'explained' by the corresponding model, and postures are then recognized from the matching scores returned by the descriptors. We adopt a state-of-the-art pose-estimation approach as our posture descriptor. Our method correctly classifies 79.7% of the test samples, outperforming the conventional approach by over 23%.
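The decision rule described above — score the image with every category-specific posture descriptor, then recognize the posture from the resulting matching scores — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the descriptor names and the simple argmax rule are assumptions (the actual system may feed the score vector into a learned classifier instead).

```python
# Hypothetical sketch of recognition from per-category matching scores.
# scores[k] = how well posture descriptor k (e.g., a pose-estimation model
# trained on category k) "explains" the person in the image; names and
# values here are illustrative only.
import numpy as np

def classify_posture(scores):
    """Return the index of the posture category whose descriptor
    best explains the image (highest matching score)."""
    scores = np.asarray(scores, dtype=float)
    return int(np.argmax(scores))

# Example with three hypothetical categories (standing, sitting, bending):
# descriptor 1 matches best, so category 1 is predicted.
print(classify_posture([0.2, 0.9, 0.4]))  # → 1
```

Replacing the argmax with a classifier trained on the full score vector (e.g., an SVM) would let the system exploit the relative responses of all descriptors rather than only the single best match.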
