Learning Group Communication from Demonstration

We consider the design of a communication policy for multi-group, multi-agent communication. The policy takes as input the state of the world (e.g., history of communication, gaze direction, body pose of others) and outputs an optimal communication mode (e.g., speaking, listening, responding) for appropriate social interaction. A key component of the policy is a communication gating module, termed the Kinesic-Proxemic-Message Gate (KPM-Gate), which automatically infers group membership so that the actions generated by the policy depend only on the relevant group members. We pose policy learning as a multi-agent imitation learning problem and learn a single policy shared across all agents, under the assumption of a decentralized Markov decision process. We term the entire policy network the Multi-Agent Group Discovery and Communication Mode Network (MAGDAM network), as it learns social group structure as well as the dynamics of group communication. Experimental validation on both synthetic and real-world data shows that the model discovers social group structure in addition to learning an accurate multi-agent communication policy.
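The gating idea can be illustrated with a small sketch. This is not the KPM-Gate itself, whose architecture the abstract does not specify; it is a hand-written stand-in that assumes a gate weight which decays with distance between agents. The function names `gate_weights` and `gated_message`, the `scale` parameter, and the one-dimensional "is speaking" message are all hypothetical choices made for illustration.

```python
import math

def gate_weights(positions, i, scale=1.0):
    """Hypothetical soft gate: weight for each agent j as seen by agent i.

    The weight decays exponentially with distance, so nearby agents
    (likely group members) contribute more. This is an assumed form,
    not the learned gate described in the abstract.
    """
    xi, yi = positions[i]
    weights = []
    for j, (xj, yj) in enumerate(positions):
        if j == i:
            weights.append(0.0)  # an agent does not receive its own message
        else:
            distance = math.hypot(xj - xi, yj - yi)
            weights.append(math.exp(-distance / scale))
    return weights

def gated_message(messages, weights):
    """Gate-weighted sum of the agents' message vectors."""
    dim = len(messages[0])
    total = [0.0] * dim
    for message, weight in zip(messages, weights):
        for k in range(dim):
            total[k] += weight * message[k]
    return total

# Two groups of two agents, far apart on a line.
positions = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
# Assumed message: 1.0 if the agent is currently speaking, else 0.0.
messages = [[1.0], [0.0], [1.0], [0.0]]

# The same gating function is applied for every agent, mirroring the
# idea of one policy shared across agents.
incoming = [gated_message(messages, gate_weights(positions, i))
            for i in range(len(positions))]
```

In this toy setup, agent 1 receives a strong signal from its speaking neighbor (agent 0) and almost nothing from the distant speaker (agent 2), so a downstream policy would condition mainly on its own group. A learned gate would replace the fixed distance rule with weights inferred from cues such as gaze and body pose.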
