ARC: Adversarially Robust Control Policies for Autonomous Vehicles

Deep neural networks have demonstrated their capability to learn control policies for a variety of tasks. However, these neural network-based policies have been shown to be susceptible to exploitation by adversarial agents. There is therefore a need for techniques that learn control policies which are robust against adversaries. We introduce Adversarially Robust Control (ARC), which trains the protagonist policy and the adversarial policy end-to-end on the same loss. The aim of the protagonist is to maximise this loss, whilst the adversary attempts to minimise it. We demonstrate the proposed ARC training in a highway driving scenario, where the protagonist controls the follower vehicle whilst the adversary controls the lead vehicle. By training the protagonist against an ensemble of adversaries, it learns a significantly more robust control policy, which generalises to a variety of adversarial strategies. The approach is shown to reduce the number of collisions against new adversaries by up to 90.25%, compared to the original policy. Moreover, by utilising an auxiliary distillation loss, the fine-tuned control policy suffers no drop in performance across its original training distribution.
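
To make the training scheme concrete, the sketch below illustrates the zero-sum update the abstract describes: the protagonist ascends a shared loss that each adversary in the ensemble descends, while an auxiliary distillation term anchors the fine-tuned protagonist to its original, frozen policy. This is a minimal PyTorch sketch under assumed conventions; the network sizes, the stand-in `shared_loss`, and the `distill_weight` coefficient are illustrative placeholders, not the paper's actual formulation.

```python
# Minimal sketch of ARC-style zero-sum fine-tuning, as described in the abstract.
# Assumes PyTorch; the loss, architectures, and hyperparameters are placeholders.
import torch
import torch.nn as nn

def make_policy(obs_dim=4, act_dim=1):
    return nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))

protagonist = make_policy()        # follower-vehicle controller being fine-tuned
frozen_original = make_policy()    # pre-trained policy, frozen for distillation
frozen_original.load_state_dict(protagonist.state_dict())
for p in frozen_original.parameters():
    p.requires_grad_(False)

adversaries = [make_policy() for _ in range(3)]  # ensemble of lead-vehicle adversaries

opt_pro = torch.optim.Adam(protagonist.parameters(), lr=1e-4)
opt_adv = [torch.optim.Adam(a.parameters(), lr=1e-4) for a in adversaries]

def shared_loss(pro_actions, adv_actions):
    # Placeholder for the paper's end-to-end loss over a rollout; a
    # differentiable stand-in so the update structure is runnable.
    return -(pro_actions - adv_actions).pow(2).mean()

distill_weight = 0.1  # assumed coefficient for the auxiliary distillation term

for step in range(1000):
    idx = step % len(adversaries)
    adversary = adversaries[idx]
    obs = torch.randn(32, 4)       # stand-in for observations from a rollout

    # Adversary step: minimise the shared loss (protagonist output detached).
    loss_adv = shared_loss(protagonist(obs).detach(), adversary(obs))
    opt_adv[idx].zero_grad()
    loss_adv.backward()
    opt_adv[idx].step()

    # Protagonist step: maximise the same loss (descend its negation),
    # with a distillation term anchoring it to the original policy.
    loss_pro = -shared_loss(protagonist(obs), adversary(obs).detach())
    distill = (protagonist(obs) - frozen_original(obs)).pow(2).mean()
    opt_pro.zero_grad()
    (loss_pro + distill_weight * distill).backward()
    opt_pro.step()
```

Alternating updates and cycling through the adversary ensemble round-robin is one plausible scheduling; the actual training may interleave rollouts and updates differently.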
