Real-time skeleton tracking for embedded systems

Touch-free gesture technology is becoming increasingly popular with consumers and may significantly shape future interfaces for digital photography. However, almost every commercial software framework for gesture and pose detection targets desktop PCs or high-powered GPUs, which makes mobile gesture recognition an attractive area for research and development. In this paper we present an algorithm for hand skeleton tracking and gesture recognition that runs on an ARM-based platform (Pandaboard ES, OMAP4460 architecture). The algorithm uses self-organizing maps to fit a given topology (the skeleton) to a 3D point cloud. This is a novel approach to pose recognition, as it employs neither complex optimization techniques nor data-driven learning. After an initial background segmentation step, the algorithm is run in parallel with heuristics that detect and correct artifacts arising from insufficient or erroneous input data. We then optimize the algorithm for the ARM platform using fixed-point computation and the NEON SIMD instruction set provided by the OMAP4460. We tested the algorithm with two different depth-sensing devices (Microsoft Kinect, PMD Camboard), and for both devices we were able to track the skeleton accurately at the cameras' native frame rates.
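
To make the core idea concrete, the following is a minimal sketch of one self-organizing-map training pass that pulls a skeleton topology toward a depth-derived 3D point cloud. The node layout, the neighbour lists, and the learning-rate parameters are illustrative assumptions for this sketch, not the exact configuration used in the paper.

```c
/* Hedged sketch: one SOM epoch adapting a skeleton topology to a point
 * cloud. Winner and topological neighbours are pulled toward each input
 * point; all names and parameters here are illustrative assumptions. */
#include <stddef.h>
#include <float.h>

typedef struct { float x, y, z; } Vec3;

typedef struct {
    Vec3 pos;            /* current node position                  */
    int  neighbors[4];   /* indices of topological neighbours      */
    int  num_neighbors;
} SomNode;

static float dist2(Vec3 a, Vec3 b)
{
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

static void move_toward(Vec3 *p, Vec3 target, float rate)
{
    p->x += rate * (target.x - p->x);
    p->y += rate * (target.y - p->y);
    p->z += rate * (target.z - p->z);
}

/* Present every cloud point once; adapt the best-matching node with
 * rate eps_win and its topological neighbours with rate eps_nbr. */
void som_epoch(SomNode *nodes, size_t n_nodes,
               const Vec3 *cloud, size_t n_points,
               float eps_win, float eps_nbr)
{
    for (size_t p = 0; p < n_points; ++p) {
        size_t best = 0;
        float  best_d = FLT_MAX;
        for (size_t n = 0; n < n_nodes; ++n) {   /* find winner */
            float d = dist2(nodes[n].pos, cloud[p]);
            if (d < best_d) { best_d = d; best = n; }
        }
        move_toward(&nodes[best].pos, cloud[p], eps_win);
        for (int k = 0; k < nodes[best].num_neighbors; ++k)
            move_toward(&nodes[nodes[best].neighbors[k]].pos,
                        cloud[p], eps_nbr);
    }
}
```

In practice the learning rates would decay over successive epochs, so that the map first unfolds coarsely over the point cloud and then settles onto the hand; keeping the topology's neighbour structure fixed is what lets the fitted node positions be read off directly as a skeleton pose.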
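The SIMD optimization can be illustrated in a similarly hedged way: the sketch below computes squared distances from one map node to four cloud points per call using NEON intrinsics, the kind of inner-loop vectorization the best-matching-node search benefits from. For brevity it uses NEON's single-precision float lanes rather than the fixed-point formats employed in the paper, and the structure-of-arrays layout and function name are assumptions.

```c
/* Hedged sketch: squared distances from one node (nx, ny, nz) to four
 * cloud points at once. Point coordinates are assumed to be stored as
 * separate x/y/z arrays (structure of arrays), 4-float aligned. */
#include <arm_neon.h>

void dist2_x4(const float *px, const float *py, const float *pz,
              float nx, float ny, float nz, float *out)
{
    float32x4_t dx = vsubq_f32(vld1q_f32(px), vdupq_n_f32(nx));
    float32x4_t dy = vsubq_f32(vld1q_f32(py), vdupq_n_f32(ny));
    float32x4_t dz = vsubq_f32(vld1q_f32(pz), vdupq_n_f32(nz));
    float32x4_t d  = vmulq_f32(dx, dx);
    d = vmlaq_f32(d, dy, dy);   /* d += dy * dy */
    d = vmlaq_f32(d, dz, dz);   /* d += dz * dz */
    vst1q_f32(out, d);          /* four squared distances */
}
```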
