Tracking gaze direction from far-field surveillance cameras

We present a real-time approach to estimating the gaze direction of multiple individuals using a network of far-field surveillance cameras. This work is part of a larger surveillance system that uses a network of fixed cameras and PTZ cameras to perform site-wide tracking of individuals. Based on the tracking information, one or more PTZ cameras are cooperatively controlled to obtain close-up facial images of individuals. Within these close-up shots, face detection and head pose estimation are performed, and the results are fed back to the tracking system so that it can track each individual's gaze. A new cost metric based on location and gaze orientation is proposed to robustly associate head observations with tracker states. The tracking system can thus leverage the newly obtained gaze information for two purposes: (i) to improve the localization of individuals in crowded settings, and (ii) to aid high-level surveillance tasks such as understanding gestures, recognizing interactions between individuals, and finding the object of interest that people are looking at. In security applications, our system can detect whether a subject is looking at security cameras or guard posts.
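As an illustration of how a cost that combines location and gaze orientation can disambiguate nearby people, the following is a minimal sketch and not the metric proposed in this work. The quadratic form of the cost, the scale parameters `sigma_pos` and `sigma_ang`, the `(x, y, gaze_angle)` state layout, and the function names are all assumptions made for the example. A brute-force search over permutations stands in for an optimal assignment solver such as the Hungarian algorithm, so it is only practical for a handful of targets.

```python
import math
from itertools import permutations


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def association_cost(track, obs, sigma_pos=0.5, sigma_ang=math.radians(30)):
    """Hypothetical cost: squared, scale-normalized position and gaze errors.

    track and obs are (x, y, gaze_angle) tuples; sigma_pos (metres) and
    sigma_ang (radians) are assumed scales, not values from the paper.
    """
    dist = math.hypot(track[0] - obs[0], track[1] - obs[1])
    dang = angle_diff(track[2], obs[2])
    return (dist / sigma_pos) ** 2 + (dang / sigma_ang) ** 2


def associate(tracks, observations):
    """Return (track_index, observation_index) pairs minimizing total cost.

    Brute force over permutations; requires len(observations) <= len(tracks).
    """
    if len(observations) > len(tracks):
        raise ValueError("more observations than tracks")
    best, best_cost = (), math.inf
    for perm in permutations(range(len(tracks)), len(observations)):
        cost = sum(
            association_cost(tracks[t], observations[o])
            for o, t in enumerate(perm)
        )
        if cost < best_cost:
            best, best_cost = perm, cost
    return [(t, o) for o, t in enumerate(best)]


# Two people 0.4 m apart, facing opposite directions.
tracks = [(0.0, 0.0, 0.0), (0.4, 0.0, math.pi)]
# Each head observation lies slightly closer to the wrong track,
# but its gaze angle matches the correct one.
observations = [
    (0.15, 0.0, math.radians(170)),
    (0.25, 0.0, math.radians(10)),
]
pairs = associate(tracks, observations)
```

In this example a position-only nearest-neighbour rule would swap the two identities, while the combined cost assigns observation 0 to track 1 and observation 1 to track 0, because the gaze term heavily penalizes a 170 degree mismatch.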
