Reward functions for learning to control in air traffic flow management

Air Traffic Flow Management (ATFM) is a complex decision-making process involving multiple stakeholders. Within this decision loop, a multi-agent system is developed for both simulation and daily operations to support human decisions. Because human factors matter in ATFM, Reinforcement Learning (RL) is well suited to acquiring the knowledge and experience of controllers and using it to assist them in subsequent control actions. This paper presents recent developments in reinforcement learning and its reward structure for ATFM decision making. Two types of reward function are proposed for agent-based RL in air traffic management: (1) a reward function that considers safety separation and the fairness impact among different commercial entities in the Ground Holding Problem (GHP), and (2) a reward function that considers safety separation in the Air Holding Problem (AHP). Real case studies in Brazil are described to show the effectiveness and efficiency of the developed reward functions in the controllers' ATFM decision process.
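
The abstract does not give the form of either reward function, so the following is only an illustrative sketch of how a GHP-style reward might combine a safety term with a fairness term. Every name, weight, and formula here (`ghp_reward`, `w_safety`, `w_fair`, the capacity-overload and delay-spread measures) is an assumption made for illustration, not the paper's definition. The tabular Q-learning update is the standard textbook rule and is included only to show where such a reward would enter the learning loop.

```python
from collections import Counter


def ghp_reward(held_flights, sector_load, sector_capacity,
               w_safety=1.0, w_fair=0.5):
    """Hypothetical reward for one ground-holding decision.

    held_flights: list of (airline, delay_minutes) pairs.
    sector_load / sector_capacity: aircraft counts (assumed safety proxy).
    """
    # Safety term (assumption): aircraft above the sector capacity.
    overload = max(0, sector_load - sector_capacity)

    # Fairness term (assumption): spread of total delay across airlines.
    delay = Counter()
    for airline, minutes in held_flights:
        delay[airline] += minutes
    unfairness = max(delay.values()) - min(delay.values()) if delay else 0

    # Both terms are costs, so the reward is their negated weighted sum.
    return -(w_safety * overload + w_fair * unfairness)


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on a dict-based Q-table."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


if __name__ == "__main__":
    r = ghp_reward([("A", 20), ("B", 5)], sector_load=16, sector_capacity=14)
    q = {}
    q_update(q, "s0", "hold", r, "s1", ["hold", "release"])
    print(r, q)
```

An AHP-style reward in the same spirit would keep only the safety term, which in this sketch corresponds to setting `w_fair` to zero.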
