Hierarchical multiagent reinforcement learning schemes for air traffic management

In this work we investigate hierarchical multiagent reinforcement learning methods for computing policies that resolve congestion problems in the air traffic management domain. To address cases where the demand for airspace exceeds capacity, we consider agents representing flights, which must decide on ground delays at the pre-tactical stage of operations so that they can execute their trajectories while adhering to airspace capacity constraints. Hierarchical reinforcement learning handles highly complex real-world problems by partitioning the task into hierarchies of states and/or actions, which provides an efficient way of exploring the state–action space and constructing an advantageous decision-making mechanism. We first establish a general framework for hierarchical multiagent reinforcement learning and then formulate four alternative abstraction schemes, on states, on actions, or on both. To quantitatively assess the quality of the solutions and show the potential of the hierarchical methods in resolving the demand–capacity balance problem, we provide experimental results on real-world evaluation cases, in which we measure the average delay per flight and the number of flights with delays.
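
To make the demand–capacity balance problem and the two reported metrics concrete, the following is a minimal sketch on a toy instance. It is not the paper's method and contains no learning: the single sector, the discrete time periods, the capacity value, and all function names are assumptions introduced purely for illustration. It only shows how a set of ground delays changes per-period demand, and how the average delay per flight and the number of delayed flights could be computed.

```python
from collections import Counter


def sector_demand(entry_periods, delays):
    """Count flights per period after each flight's ground delay is applied."""
    return Counter(entry + delay for entry, delay in zip(entry_periods, delays))


def hotspots(demand, capacity):
    """Return the periods in which demand exceeds the (assumed) capacity."""
    return sorted(period for period, count in demand.items() if count > capacity)


def delay_metrics(delays):
    """Return (average delay per flight, number of flights with a delay)."""
    delayed_flights = sum(1 for d in delays if d > 0)
    return sum(delays) / len(delays), delayed_flights


# Hypothetical toy instance: four flights entering one sector,
# with an assumed capacity of two flights per period.
entry_periods = [0, 0, 0, 1]
capacity = 2

no_delay = [0, 0, 0, 0]   # every flight departs as scheduled
resolved = [0, 0, 1, 1]   # two flights each accept one period of ground delay

print(hotspots(sector_demand(entry_periods, no_delay), capacity))
print(hotspots(sector_demand(entry_periods, resolved), capacity))
print(delay_metrics(resolved))
```

In this toy setting the undelayed schedule overloads period 0, while the second delay assignment removes the overload at the cost of delaying two flights. A learning scheme of the kind described above would search for such assignments while keeping both metrics low.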