Towards Kinect-based wheelchair user behavior analysis

This thesis presents techniques that can help estimate, in real time, the level of attention of a person seated in a wheelchair. We focus on person detection, head pose estimation, and facial feature localization, using depth data from the Kinect, a low-quality consumer depth camera. Our person detector is based on the flood fill algorithm and requires no initialization. We estimate head pose with discriminative random regression forests and filter the estimates over time, which yields robust results. Finally, we attempt to extract the 3D locations of a set of chosen facial points using random regression forests trained on a dataset acquired with the Kinect sensor.
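As a rough illustration of flood-fill segmentation on a depth image, the following minimal Python sketch grows a region from a starting pixel by joining 4-connected neighbours whose depth is close to that of the pixel they are reached from. It is not the thesis implementation: the function name `flood_fill_depth`, the explicit `seed` argument, the `max_step` threshold, the millimetre units, and the convention that a depth of 0 marks an invalid pixel are all assumptions made for this example. The abstract does not say how the starting point or the thresholds are chosen.

```python
from collections import deque


def flood_fill_depth(depth, seed, max_step=50):
    """Grow a region from `seed` over a 2D depth map (list of rows).

    A 4-connected neighbour joins the region when its depth differs from
    the pixel it is reached from by at most `max_step` (hypothetical
    threshold, assumed to be in millimetres). Pixels with depth 0 are
    treated as invalid and skipped (an assumed convention).
    """
    rows, cols = len(depth), len(depth[0])
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue  # outside the image
            if (nr, nc) in region or depth[nr][nc] == 0:
                continue  # already visited, or no depth reading
            if abs(depth[nr][nc] - depth[r][c]) <= max_step:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Toy 4x4 depth map: a near object (about 1.2 m) in front of a far wall (3 m).
depth = [
    [3000, 3000, 3000, 3000],
    [3000, 1200, 1210, 3000],
    [3000, 1190,    0, 3000],
    [3000, 3000, 3000, 3000],
]
person = flood_fill_depth(depth, seed=(1, 1))
```

In the toy map, the region stops at the large depth jump between the near object and the wall, which is the property that makes flood fill attractive for separating a seated person from the background.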
