Differential Privacy in Cooperative Multiagent Planning

Privacy-aware multiagent systems must protect agents' sensitive data while simultaneously ensuring that agents accomplish their shared objectives. Towards this goal, we propose a framework to privatize inter-agent communications in cooperative multiagent decision-making problems. We study sequential decision-making problems formulated as cooperative Markov games with reach-avoid objectives. We apply a differential privacy mechanism to privatize agents' communicated symbolic state trajectories, and then we analyze tradeoffs between the strength of privacy and the team's performance. For a given level of privacy, this tradeoff is shown to depend critically upon the total correlation among agents' state-action processes. We synthesize policies that are robust to privacy by reducing the value of the total correlation. Numerical experiments demonstrate that the team's performance under these policies decreases by only 3 percent when comparing private versus non-private implementations of communication. By contrast, the team's performance decreases by roughly 86 percent when using baseline policies that ignore total correlation and only optimize team performance.
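
The abstract's central quantity, total correlation, can be illustrated with a minimal sketch. It assumes the standard information-theoretic definition (the sum of the marginal entropies minus the joint entropy) applied to a small discrete joint distribution. The paper applies the quantity to agents' state-action processes, whereas the two-agent distributions `coupled` and `independent` below are hypothetical toy examples, and the function names are illustrative and not taken from the paper.

```python
import math


def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def marginal(joint, i):
    """Marginal pmf of the i-th component of a joint pmf keyed by tuples."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[i]] = out.get(outcome[i], 0.0) + p
    return out


def total_correlation(joint):
    """Sum of marginal entropies minus the joint entropy (in bits)."""
    n = len(next(iter(joint)))
    return sum(entropy(marginal(joint, i)) for i in range(n)) - entropy(joint)


# Hypothetical two-agent examples (assumptions for illustration only).
# Agents whose local states always agree are maximally dependent.
coupled = {(0, 0): 0.5, (1, 1): 0.5}
# Agents whose local states are independent fair coin flips.
independent = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}

tc_coupled = total_correlation(coupled)
tc_independent = total_correlation(independent)
```

In this toy setting the coupled pair has a total correlation of one bit and the independent pair has zero. Intuitively, lower total correlation means each agent relies less on knowing the others' exact states, which is consistent with the abstract's claim that reducing it makes policies more robust to privatized communication.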
