Reinforcement Learning for Autonomous Driving with Latent State Inference and Spatial-Temporal Relationships

Deep reinforcement learning (DRL) offers a promising approach to learning navigation in complex autonomous driving scenarios. However, identifying the subtle cues that can lead to drastically different outcomes remains an open problem in the design of autonomous systems that operate in human environments. In this work, we show that explicitly inferring the latent state and encoding spatial-temporal relationships in a reinforcement learning framework can help address this difficulty. We encode prior knowledge about the latent states of other drivers through a framework that combines the reinforcement learner with a supervised learner. In addition, we model the influence passing between different vehicles through graph neural networks (GNNs). The proposed framework significantly improves performance in the context of navigating T-intersections compared with state-of-the-art baseline approaches.
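The two ingredients described above can be sketched as follows. This is a minimal illustration, not the paper's actual architecture: a single mean-aggregation GNN layer produces interaction-aware per-vehicle embeddings, and the ego vehicle's embedding feeds both a policy head and an auxiliary supervised head that predicts the latent state of surrounding drivers. All dimensions, the head names, and the conservative/aggressive latent-state labels are illustrative assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def gnn_layer(H, A, W):
    """One message-passing step: mean-aggregate neighbor features
    over the adjacency A (self-loops included), then linear + ReLU."""
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)
    return relu(((A @ H) / deg) @ W)

rng = np.random.default_rng(0)
n_vehicles, d_in, d_hid = 4, 5, 8

H = rng.normal(size=(n_vehicles, d_in))   # per-vehicle observations (pos, vel, ...)
A = np.ones((n_vehicles, n_vehicles))     # fully connected graph incl. self-loops
W = rng.normal(size=(d_in, d_hid))        # shared layer weights

Z = gnn_layer(H, A, W)                    # interaction-aware embeddings

# Two heads share the ego vehicle's embedding (index 0). In training, the
# policy head would receive the RL gradient and the auxiliary head a
# supervised loss against the simulator's ground-truth driver type.
W_pi  = rng.normal(size=(d_hid, 3))       # policy logits: brake / keep / accelerate
W_aux = rng.normal(size=(d_hid, 2))       # latent-state logits: conservative / aggressive

policy_logits = Z[0] @ W_pi
latent_logits = Z[0] @ W_aux
```

Sharing the GNN trunk between the two heads is what lets the supervised latent-state signal shape the representation the policy acts on.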
