Combining Propositional Logic Based Decision Diagrams with Decision Making in Urban Systems

Solving multiagent problems is difficult because of uncertainty in the environment, partial observability, and the challenge of scaling to large problem instances. Urban settings add further difficulties: the system must keep all users safe while minimizing both agent congestion and travel times. To this end, we address multiagent pathfinding under uncertainty and partial observability, in which agents must move from their start locations to their destinations while satisfying constraints such as low congestion, and we model it as a multiagent reinforcement learning (RL) problem. We compile the domain constraints using propositional logic and integrate the compiled constraints with RL algorithms to enable fast simulation for RL.
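As a rough illustration of how a path constraint can speed up simulation, the sketch below masks an agent's moves so that every sampled rollout is a simple path that reaches its destination. It is not the method of this work: the constraint is checked by a brute-force breadth-first search rather than by a compiled decision diagram, and the graph `GRAPH` and the helpers `reachable`, `valid_actions`, and `sample_path` are hypothetical names introduced only for this example.

```python
import random
from collections import deque

# Hypothetical 2x3 grid road network (nodes a..f), used only for illustration.
GRAPH = {
    "a": ["b", "d"],
    "b": ["a", "c", "e"],
    "c": ["b", "f"],
    "d": ["a", "e"],
    "e": ["b", "d", "f"],
    "f": ["c", "e"],
}


def reachable(graph, start, goal, blocked):
    """Return True if goal can be reached from start without entering blocked nodes."""
    if start == goal:
        return True
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt == goal:
                return True
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                queue.append(nxt)
    return False


def valid_actions(graph, path, goal):
    """Moves that keep the path simple and still allow the goal to be reached.

    Assumption: this brute-force check stands in for a query against a
    compiled propositional representation of the same constraint.
    """
    visited = set(path)
    return [
        nxt
        for nxt in graph[path[-1]]
        if nxt not in visited and reachable(graph, nxt, goal, visited)
    ]


def sample_path(graph, start, goal, rng):
    """Sample a rollout using only constraint-satisfying moves (no dead ends)."""
    path = [start]
    while path[-1] != goal:
        path.append(rng.choice(valid_actions(graph, path, goal)))
    return path


if __name__ == "__main__":
    print(sample_path(GRAPH, "a", "f", random.Random(0)))
```

Because invalid moves are filtered before sampling, no rollout is wasted on a trajectory that revisits a node or runs into a dead end. For example, after the partial path a, b, e toward f, the move to d is masked out because d's only neighbors have already been visited.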
