Collaborative multiagent reinforcement learning schemes for air traffic management

In this work we investigate the use of hierarchical collaborative multiagent reinforcement learning methods (H-CMARL) for computing joint policies that resolve congestion problems in the Air Traffic Management (ATM) domain. In particular, to address cases where demand for airspace use exceeds capacity, we consider agents representing flights, which must jointly decide on ground delays at the pre-tactical stage of operations so as to execute their trajectories while adhering to airspace capacity constraints. Starting from a multiagent Markov Decision Process formulation of the problem, we introduce a flat and a hierarchical collaborative multiagent reinforcement learning method, the latter operating at two levels (the ground level and an abstract one). To quantitatively assess the quality of the solutions produced by the proposed approaches, and to show the potential of the hierarchical method in resolving demand-capacity balance problems, we provide experimental results on real-world evaluation cases, measuring the average delay of flights and the number of flights with delays.
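To make the setting concrete, the sketch below illustrates the kind of congestion problem the abstract describes: flight agents jointly choosing ground-delay slots under a shared sector-capacity constraint, trained with independent Q-learners. This is a minimal illustration only, not the paper's H-CMARL method; the flight count, delay discretisation, capacity, and penalty values are all invented for the example.

```python
import random
from collections import defaultdict

random.seed(0)

N_FLIGHTS = 4
DELAY_OPTIONS = [0, 1, 2, 3]   # hypothetical ground-delay slots
CAPACITY = 2                   # max flights allowed per slot in the shared sector
EPISODES = 2000
ALPHA, EPS = 0.1, 0.1

# One Q-table per flight agent: delay choice -> estimated value (stateless view)
Q = [defaultdict(float) for _ in range(N_FLIGHTS)]

def joint_reward(actions):
    """Each agent pays its own delay, plus a congestion penalty
    whenever the number of flights sharing its slot exceeds capacity."""
    counts = defaultdict(int)
    for a in actions:
        counts[a] += 1
    return [-a - (10 if counts[a] > CAPACITY else 0) for a in actions]

for _ in range(EPISODES):
    # Epsilon-greedy joint action: each agent picks a delay independently
    actions = [
        random.choice(DELAY_OPTIONS) if random.random() < EPS
        else max(DELAY_OPTIONS, key=lambda a: Q[i][a])
        for i in range(N_FLIGHTS)
    ]
    rewards = joint_reward(actions)
    for i, (a, r) in enumerate(zip(actions, rewards)):
        Q[i][a] += ALPHA * (r - Q[i][a])  # running-average update

# Greedy joint policy after learning
greedy = [max(DELAY_OPTIONS, key=lambda a: Q[i][a]) for i in range(N_FLIGHTS)]
```

The congestion penalty plays the role of the capacity constraint: spreading flights across slots (two at delay 0, two at delay 1) avoids the penalty at minimal total delay, which is the demand-capacity balancing trade-off the abstract refers to.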
