Enhance Sample Efficiency and Robustness of End-to-end Urban Autonomous Driving via Semantic Masked World Model

End-to-end autonomous driving offers a feasible way to maximize overall driving system performance automatically by mapping raw pixels from a front-facing camera directly to control signals. Recent methods construct a latent world model that maps high-dimensional observations into a compact latent space. However, the latent states embedded by the world models in previous work may contain a large amount of task-irrelevant information, resulting in low sample efficiency and poor robustness to input perturbations. Meanwhile, the training data distribution is usually unbalanced, so the learned policy struggles to cope with corner cases during driving. To address these challenges, we present a semantic masked recurrent world model (SEM2), which introduces a latent filter to extract key task-relevant features and reconstructs a semantic mask from the filtered features. SEM2 is trained with a multi-source data sampler, which aggregates common data and multiple kinds of corner-case data in a single batch to balance the data distribution. Extensive experiments on CARLA show that our method outperforms state-of-the-art approaches in terms of sample efficiency and robustness to input perturbations.
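
The multi-source data sampler mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_multi_source_batch`, the buffer names (`common`, `collision`, `off_lane`), and the mixing fractions are all hypothetical, chosen only to show how one batch might combine common data with several kinds of corner-case data.

```python
import random
from typing import Dict, List, Sequence


def sample_multi_source_batch(
    buffers: Dict[str, Sequence],
    fractions: Dict[str, float],
    batch_size: int,
    rng: random.Random,
) -> List:
    """Draw one batch that mixes items from several replay buffers.

    Hypothetical helper: buffer names and fractions are assumptions,
    not values taken from the paper.
    """
    names = list(buffers)
    # Integer quota per source; any rounding remainder goes to the first source.
    counts = {name: int(batch_size * fractions[name]) for name in names}
    counts[names[0]] += batch_size - sum(counts.values())

    batch = []
    for name in names:
        # Sample with replacement so a small corner-case buffer can still
        # fill its quota.
        batch.extend(rng.choices(buffers[name], k=counts[name]))
    rng.shuffle(batch)
    return batch


# Illustrative buffers: each item is tagged with the source it came from.
buffers = {
    "common": [("common", i) for i in range(1000)],
    "collision": [("collision", i) for i in range(20)],
    "off_lane": [("off_lane", i) for i in range(15)],
}
fractions = {"common": 0.5, "collision": 0.25, "off_lane": 0.25}
batch = sample_multi_source_batch(buffers, fractions, 32, random.Random(0))
```

Sampling a fixed quota from each source means rare corner cases appear in every batch regardless of how seldom they occur during data collection, which is the balancing effect the abstract describes.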
