A Game-Theoretic Probabilistic Approach for Detecting Conversational Groups

A standing conversational group (also known as an F-formation) occurs when two or more people sustain a social interaction, such as chatting at a cocktail party. Detecting such interactions in images or videos is of fundamental importance in many contexts, such as surveillance, social signal processing, social robotics, and activity classification. This paper presents an approach to this problem that models the socio-psychological concept of an F-formation together with the biological constraints of social attention. Essentially, an F-formation constrains how subjects must be mutually located and oriented, while the biological constraints define the plausible zone in which people can interact. We develop a game-theoretic framework embedding these constraints, supported by a statistical model of the uncertainty associated with the position and orientation of people. First, we use a novel representation of the affinity between pairs of people, expressed as a distance between distributions over the most plausible oriented region of attention. Additionally, we integrate temporal information over multiple frames to smooth noisy head orientation and pose estimates, resolve ambiguous situations, and establish a more precise social context. We do this in a principled way by using recent notions from multi-payoff evolutionary game theory. Experiments on several benchmark datasets consistently show the superiority of the proposed approach over the state of the art and its robustness under severe noise conditions.
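To make the game-theoretic grouping idea concrete, the following is a minimal sketch, not the paper's actual method. It assumes a symmetric, non-negative pairwise affinity matrix is already available (here a hand-written toy matrix, not one derived from attention distributions) and uses standard discrete-time replicator dynamics to find one cohesive group as the support of the converged strategy vector. The function names `replicator_dynamics` and `extract_group`, the threshold value, and the toy data are all illustrative assumptions.

```python
import numpy as np


def replicator_dynamics(A, iters=1000, tol=1e-9):
    """Run discrete-time replicator dynamics x_i <- x_i * (Ax)_i / (x^T A x).

    A is assumed symmetric and non-negative with a zero diagonal.
    Starts from the barycenter of the simplex (uniform weights).
    """
    n = A.shape[0]
    x = np.full(n, 1.0 / n)
    for _ in range(iters):
        Ax = A @ x
        avg_payoff = x @ Ax
        if avg_payoff <= 0:  # degenerate case: no positive affinity in the support
            break
        x_new = x * Ax / avg_payoff
        converged = np.abs(x_new - x).sum() < tol
        x = x_new
        if converged:
            break
    return x


def extract_group(A, support_threshold=1e-4):
    """Return indices whose converged weight exceeds a (hypothetical) threshold."""
    x = replicator_dynamics(A)
    return np.flatnonzero(x > support_threshold)


# Toy affinities (assumed values): persons 0, 1, 2 face each other closely,
# person 3 stands apart and has only weak affinity with everyone.
A = np.array([
    [0.00, 0.90, 0.80, 0.05],
    [0.90, 0.00, 0.85, 0.05],
    [0.80, 0.85, 0.00, 0.05],
    [0.05, 0.05, 0.05, 0.00],
])

group = extract_group(A)
print(group.tolist())
```

On this toy matrix the weight of person 3 decays toward zero, so the extracted support is persons 0, 1, and 2. Further groups could be found by removing the extracted members and repeating; how the paper combines several payoff matrices over time is not reproduced here.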
