Safe and feasible motion generation for autonomous driving via constrained policy net

Policy networks have great potential to learn sophisticated driving policies under complicated interactions among human drivers. However, it is hard for policy networks to satisfy safety and feasibility constraints, a task that poses little difficulty for conventional motion generation methods such as optimization-based approaches. In this paper, we propose the Constrained Policy Net (CPN), which can learn a safe and feasible driving policy from arbitrary inequality-constrained, optimization-based expert planners. Instead of using supervised learning with an L2-norm loss, we incorporate the domain knowledge of the expert planner directly into the training loss of the policy net by applying barrier functions to the safety and feasibility constraints of the optimization problem. The proposed CPN is implemented in an exemplar scenario with obstacles on both sides. Test results demonstrate that the policy net learns to generate motions near the boundaries of the safety and feasibility constraints, achieving driving quality as high as that of the baseline optimization while still satisfying the constraints.
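The idea of folding constraints into the training loss can be illustrated with a minimal sketch. The abstract does not specify the barrier function, the planner cost, or the network, so everything below is an assumption made for illustration: a relaxed log barrier (quadratically extended so that it stays finite for infeasible outputs), a simple tracking cost, and the hypothetical names `relaxed_log_barrier`, `barrier_loss`, `y_min`, and `y_max`. The two lateral bounds stand in for the obstacles on both sides.

```python
import numpy as np

def relaxed_log_barrier(g, t=10.0, delta=0.1):
    """Barrier for inequality constraints g <= 0 (assumed form, not from the paper).

    Uses -log(-g) while the slack -g is at least delta, and a quadratic
    extension below that, so the value and slope stay finite even when
    the network output violates a constraint.
    """
    z = -np.asarray(g, dtype=float)           # slack: positive when feasible
    log_part = -np.log(np.maximum(z, delta))  # clamp avoids log of values <= 0
    quad_part = 0.5 * (((z - 2.0 * delta) / delta) ** 2 - 1.0) - np.log(delta)
    return np.where(z >= delta, log_part, quad_part) / t

def barrier_loss(y, y_ref, y_min, y_max, t=10.0):
    """Planner-style cost plus barrier terms on the lateral bounds.

    y      : lateral positions produced by the policy net (hypothetical output)
    y_ref  : reference positions that the assumed planner cost tracks
    y_min, y_max : lateral limits imposed by obstacles on either side
    """
    y = np.asarray(y, dtype=float)
    cost = np.mean((y - np.asarray(y_ref, dtype=float)) ** 2)
    g = np.concatenate([y - y_max, y_min - y])  # both written as g <= 0
    return cost + np.sum(relaxed_log_barrier(g, t))

y_ref = np.zeros(3)
loss_safe = barrier_loss([0.0, 0.2, 0.4], y_ref, y_min=-1.0, y_max=1.0)
loss_unsafe = barrier_loss([0.0, 0.2, 1.5], y_ref, y_min=-1.0, y_max=1.0)
print(loss_safe < loss_unsafe)
```

In this sketch, a larger `t` weakens the barrier inside the feasible region, so outputs can approach the constraint boundary more closely, while outputs that cross the boundary still incur a steeply growing penalty.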
