Multiview social behavior analysis in work environments

In this paper, we propose an approach that fuses information from a network of visual sensors to analyze human social behavior. A discriminative interaction classifier is trained on the relative head orientation and the distance between a pair of people. Specifically, we explore human interaction detection at two levels of fusion: feature fusion and decision fusion. While feature fusion mitigates local errors and improves feature accuracy, decision fusion at a higher level significantly reduces the amount of information that must be shared among cameras. Experimental results show that the proposed method achieves promising performance on a challenging dataset. By distributing the computation over multiple smart cameras, our approach is not only robust but also scalable.
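To make the contrast between the two fusion levels concrete, the following is a minimal sketch and not the implementation from the paper. It assumes each camera reports a ground-plane position and a head angle (in radians) for both people. The function names, the feature definition, and the thresholded `classify` rule are hypothetical stand-ins for the trained discriminative classifier described above.

```python
import math

def circular_mean(angles):
    """Mean of angles in radians, robust to wrap-around at +/- pi."""
    return math.atan2(sum(math.sin(a) for a in angles),
                      sum(math.cos(a) for a in angles))

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in [0, pi]."""
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi)

def pair_features(pos_a, head_a, pos_b, head_b):
    """Distance, plus how far each head is turned away from the other person."""
    dx, dy = pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]
    dist = math.hypot(dx, dy)
    off_a = angle_diff(head_a, math.atan2(dy, dx))    # A looking toward B?
    off_b = angle_diff(head_b, math.atan2(-dy, -dx))  # B looking toward A?
    return dist, off_a, off_b

def classify(features, max_dist=2.0, max_angle=math.radians(45)):
    """Hypothetical threshold rule standing in for a trained classifier."""
    dist, off_a, off_b = features
    return dist <= max_dist and off_a <= max_angle and off_b <= max_angle

def feature_fusion(observations):
    """Fuse per-camera estimates first, then classify once.

    Every camera has to share its full feature estimates.
    """
    n = len(observations)
    pos_a = (sum(o[0][0] for o in observations) / n,
             sum(o[0][1] for o in observations) / n)
    pos_b = (sum(o[2][0] for o in observations) / n,
             sum(o[2][1] for o in observations) / n)
    head_a = circular_mean([o[1] for o in observations])
    head_b = circular_mean([o[3] for o in observations])
    return classify(pair_features(pos_a, head_a, pos_b, head_b))

def decision_fusion(observations):
    """Classify locally on each camera, then take a majority vote.

    Every camera only has to share a single binary decision.
    """
    votes = [classify(pair_features(*o)) for o in observations]
    return 2 * sum(votes) > len(votes)

# Illustrative data: (pos_a, head_a, pos_b, head_b) per camera.
# The third camera has a poor head estimate for person A.
observations = [
    ((0.0, 0.0), 0.0,              (1.0, 0.0), math.pi),
    ((0.1, 0.0), math.radians(10), (1.0, 0.1), math.radians(170)),
    ((0.0, 0.1), math.radians(80), (1.1, 0.0), math.radians(180)),
]
fused_by_feature = feature_fusion(observations)
fused_by_decision = decision_fusion(observations)
```

In this sketch, averaging the features damps the single bad head estimate before classification, while the majority vote tolerates it after classification and needs only one bit per camera per pair of people.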
