Paper Information - Learning Fairness in Multi-Agent Systems

Learning Fairness in Multi-Agent Systems

Fairness is essential for human society, contributing to stability and productivity. Similarly, fairness is key to many multi-agent systems. Incorporating fairness into multi-agent learning could help multi-agent systems become both efficient and stable. However, learning efficiency and fairness simultaneously is a complex, multi-objective, joint-policy optimization problem. To tackle these difficulties, we propose FEN, a novel hierarchical reinforcement learning model. We first decompose fairness for each agent and propose a fair-efficient reward that each agent learns its own policy to optimize. To avoid multi-objective conflict, we design a hierarchy consisting of a controller and several sub-policies, where the controller maximizes the fair-efficient reward by switching among the sub-policies, which provide diverse behaviors for interacting with the environment. FEN can be trained in a fully decentralized way, making it easy to deploy in real-world applications. Empirically, we show that FEN easily learns both fairness and efficiency and significantly outperforms baselines in a variety of multi-agent scenarios.
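
The hierarchy described in the abstract can be pictured with a small sketch. The abstract does not give the reward formula or the controller's learning rule, so everything below is an illustrative assumption rather than the authors' implementation: `fair_efficient_reward` is one plausible form that grows with the system's mean utility (efficiency) and shrinks as an agent's own utility deviates from that mean (fairness), and `Controller` is a hypothetical epsilon-greedy selector standing in for a learned controller policy. The constants `c` and `eps` and the assumption that each agent can obtain the mean utility are also hypothetical.

```python
import random


def fair_efficient_reward(own_utility, mean_utility, c=1.0, eps=0.1):
    """Assumed reward form: efficiency term divided by a fairness penalty."""
    if mean_utility <= 0:
        return 0.0  # nothing has been gained yet, so there is nothing to reward
    efficiency = mean_utility / c                       # higher when the system does well
    deviation = abs(own_utility / mean_utility - 1.0)   # zero when the agent sits at the mean
    return efficiency / (eps + deviation)


class Controller:
    """Hypothetical per-agent controller that switches among sub-policies."""

    def __init__(self, num_sub_policies, explore=0.1, lr=0.1, seed=0):
        self.values = [0.0] * num_sub_policies  # running reward estimate per sub-policy
        self.explore = explore
        self.lr = lr
        self.rng = random.Random(seed)

    def select(self):
        # Explore with probability `explore`, otherwise pick the best estimate.
        if self.rng.random() < self.explore:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, sub_policy, reward):
        # Move the chosen sub-policy's estimate toward the observed reward.
        self.values[sub_policy] += self.lr * (reward - self.values[sub_policy])


# Usage: one controller decision for a single agent.
controller = Controller(num_sub_policies=3)
chosen = controller.select()
reward = fair_efficient_reward(own_utility=0.8, mean_utility=1.0)
controller.update(chosen, reward)
```

In this sketch each agent owns its controller and needs only its own utility plus an estimate of the mean utility, which is in the spirit of the decentralized training the abstract mentions. How that mean would be obtained without a central coordinator is left open here.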

Zongqing Lu | Jiechuan Jiang
