Detecting Mutual Awareness Events

Multiple human observers often attend to a single static interest point. This is known as a mutual awareness event (MAWE). A preferred way to monitor such situations is with a camera that captures the human observers, combined with existing face detection and head pose estimation algorithms. The current work studies the underlying geometric constraints of MAWEs and reformulates them in terms of image measurements. The constraints are then used in a method that 1) detects whether such an interest point exists, 2) determines where it is located, 3) identifies who was attending to it, and 4) reports where and when each observer was while attending to it. The method is also applied to a related event in which a single moving human observer fixates on a single static interest point. The method handles the general case of an uncalibrated camera in a general environment, in contrast to other work on similar problems that inherently assumes a known environment or a calibrated camera. The method was tested on about 75 images from various scenes, most of them found by searching the Internet, and it robustly detects MAWEs and estimates their related attributes.
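As a rough illustration of the four outputs listed above, the sketch below works through a deliberately simplified 2D analogue: each observer is reduced to an image position and a gaze direction, and the interest point is taken as the least-squares intersection of the resulting gaze lines. This is not the paper's formulation, which handles the geometry of an uncalibrated camera. The function names, the tolerance `tol`, and the use of infinite lines instead of rays are all assumptions made for the example.

```python
import math


def least_squares_intersection(points, directions):
    """Return the 2D point minimizing the summed squared distance to all lines.

    Each line passes through points[i] with direction directions[i].
    Hypothetical helper for illustration only.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), (dx, dy) in zip(points, directions):
        norm = math.hypot(dx, dy)
        dx, dy = dx / norm, dy / norm
        # Projector onto the line's normal direction: I - d d^T
        m11, m12, m22 = 1.0 - dx * dx, -dx * dy, 1.0 - dy * dy
        a11 += m11
        a12 += m12
        a22 += m22
        b1 += m11 * px + m12 * py
        b2 += m12 * px + m22 * py
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("gaze lines are (nearly) parallel; no unique point")
    # Solve the symmetric 2x2 system by Cramer's rule.
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def point_line_distance(point, origin, direction):
    """Perpendicular distance from point to the line through origin."""
    (x, y), (px, py), (dx, dy) = point, origin, direction
    return abs((x - px) * dy - (y - py) * dx) / math.hypot(dx, dy)


def attending_observers(points, directions, tol):
    """Estimate the interest point and list observers whose gaze line passes within tol of it."""
    target = least_squares_intersection(points, directions)
    inliers = [
        i
        for i, (p, d) in enumerate(zip(points, directions))
        if point_line_distance(target, p, d) <= tol
    ]
    return target, inliers


# Three observers whose gaze lines all pass through (2, 2).
observers = [(0.0, 0.0), (4.0, 0.0), (2.0, 4.0)]
gazes = [(2.0, 2.0), (-2.0, 2.0), (0.0, -2.0)]
target, inliers = attending_observers(observers, gazes, tol=1e-6)
```

A plain least-squares fit like this is pulled off target by any observer who is looking elsewhere, so a practical detector would wrap it in a robust estimation scheme and would also check that the estimated point lies in front of each observer.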
