Hierarchical Program-Triggered Reinforcement Learning Agents for Automated Driving

Recent advances in Reinforcement Learning (RL) combined with Deep Learning (DL) have demonstrated impressive performance in complex tasks, including autonomous driving. RL agents can produce a smooth, human-like driving experience, but the limited interpretability of Deep Reinforcement Learning (DRL) creates a verification and certification bottleneck. Instead of relying on RL agents to learn complex tasks, we propose Hierarchical Program-triggered Reinforcement Learning (HPRL), which uses a hierarchy consisting of a structured master program and multiple RL agents, each trained to perform a relatively simple task. The focus of verification shifts to the master program, under simple guarantees from the RL agents, leading to a significantly more interpretable and verifiable implementation than a single complex RL agent. We evaluate the framework on different driving tasks and on National Highway Traffic Safety Administration (NHTSA) pre-crash scenarios using CARLA, an open-source dynamic urban simulation environment.
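As a rough illustration of the hierarchy described above, the following Python sketch shows a structured program that inspects an observation and triggers one of several simple agents. It is not the authors' implementation: the `Observation` fields, the agent names (`lane_follow`, `lane_change`, `brake`), the `safe_gap_m` threshold, and the fixed actions returned by the stub policies are all hypothetical stand-ins for trained RL policies and real sensor data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical observation; a real system would derive this from sensors.
@dataclass
class Observation:
    gap_to_lead_m: float       # distance to the vehicle ahead, in metres
    adjacent_lane_free: bool   # whether the neighbouring lane is clear
    at_intersection: bool      # whether the ego vehicle is at an intersection

# An action is (steering, acceleration), both assumed to lie in [-1, 1].
Action = Tuple[float, float]
Policy = Callable[[Observation], Action]

# Stub policies standing in for separately trained RL agents (assumption).
def lane_follow(obs: Observation) -> Action:
    return (0.0, 0.5)

def lane_change(obs: Observation) -> Action:
    return (0.2, 0.4)

def brake(obs: Observation) -> Action:
    return (0.0, -1.0)

AGENTS: Dict[str, Policy] = {
    "lane_follow": lane_follow,
    "lane_change": lane_change,
    "brake": brake,
}

def master_program(obs: Observation, safe_gap_m: float = 10.0) -> str:
    """Structured, human-readable rules that choose which agent to trigger."""
    if obs.gap_to_lead_m < safe_gap_m:
        # Too close to the lead vehicle: change lane only if it is safe to.
        if obs.adjacent_lane_free and not obs.at_intersection:
            return "lane_change"
        return "brake"
    return "lane_follow"

def step(obs: Observation) -> Tuple[str, Action]:
    """Run one control step: pick an agent, then query it for an action."""
    name = master_program(obs)
    return name, AGENTS[name](obs)
```

Because the branching logic lives in `master_program` rather than inside a neural network, properties such as "never change lanes at an intersection" can be checked by reading or analysing that function alone, assuming each agent meets its own simple guarantee.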
