Gaze and body pose estimation from a distance

We present a comprehensive approach to tracking gaze by estimating the location, body pose, and head pose direction of multiple individuals in unconstrained environments. The approach combines person detections from fixed cameras with directional face detections obtained from actively controlled pan-tilt-zoom (PTZ) cameras. The main contribution of this work is to estimate both body pose and head pose (gaze) direction independently of motion direction, using a combination of sequential Monte Carlo filtering and Markov chain Monte Carlo (MCMC) sampling. Tracking body pose and gaze has several benefits in surveillance: it makes it possible to track people's focus of attention, it can optimize the control of active cameras for biometric face capture, and it can provide better interaction metrics between pairs of people. The availability of gaze and face detection information also improves localization and data association for tracking in crowded environments. The performance of the system is demonstrated on data captured at a real-time surveillance site.
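To give a rough feel for the sequential Monte Carlo filtering mentioned above, the following is a minimal sketch of a particle filter that tracks a single gaze angle from noisy directional face detections. It is not the method of this paper: the MCMC sampling step, body pose, location, and multi-person data association are all omitted. The random-walk motion model, the von Mises-style likelihood, and every name and parameter (`smc_step`, `track_gaze`, `process_std`, `kappa`) are assumptions chosen for illustration only.

```python
import math
import random


def wrap_angle(a):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def circular_mean(angles):
    """Mean direction of a set of angles, computed on the unit circle."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)


def smc_step(particles, observation, rng, process_std=0.15, kappa=4.0):
    """One predict / weight / resample step for a gaze-angle particle set.

    `observation` is a detected face direction in radians, or None when no
    face detection is available for this frame (hypothetical convention).
    """
    # Predict: random-walk dynamics on the angle (an assumed motion model).
    predicted = [wrap_angle(p + rng.gauss(0.0, process_std)) for p in particles]
    if observation is None:
        return predicted
    # Weight: unnormalized von Mises likelihood centered on the observation.
    weights = [math.exp(kappa * math.cos(observation - p)) for p in predicted]
    # Resample: draw particles with replacement in proportion to their weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def track_gaze(observations, n_particles=500, seed=0):
    """Return one gaze-angle estimate per frame."""
    rng = random.Random(seed)
    # Start from a uniform prior over all directions, so the estimate does
    # not rely on the person's direction of motion.
    particles = [rng.uniform(-math.pi, math.pi) for _ in range(n_particles)]
    estimates = []
    for obs in observations:
        particles = smc_step(particles, obs, rng)
        estimates.append(circular_mean(particles))
    return estimates


if __name__ == "__main__":
    # Twenty detections facing roughly 1.0 rad, then two frames with none.
    frames = [1.0] * 20 + [None] * 2
    print(track_gaze(frames)[-1])
```

Because the state here is only the gaze angle and the prior is uniform over directions, the estimate is driven by the face detections alone, which mirrors the idea of estimating gaze independently of motion direction. A fuller treatment would place location, body pose, and head pose in a joint state.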
