Data Efficient Reinforcement Learning for Integrated Lateral Planning and Control in Automated Parking System

Reinforcement learning (RL) is a promising direction for automated parking systems (APSs), because integrating planning and tracking control with RL can potentially maximize overall performance. However, commonly used model-free RL requires many interactions to achieve acceptable performance, and model-based RL in APS cannot learn continuously. In this paper, a data-efficient RL method is constructed that learns from data with the aid of a model-based method. The proposed method uses a truncated Monte Carlo tree search to evaluate parking states and select moves. Two artificial neural networks are trained on data generated during self-training to provide the search probability of each tree branch and the final reward for each state. Data efficiency is enhanced by weighting exploration with parking trajectory returns, by an adaptive exploration scheme, and by experience augmentation with imaginary rollouts. A novel training pipeline is also used to train the initial action guidance network and the state value network without human demonstrations. Compared with path planning and path-following methods, the proposed integrated method can flexibly coordinate longitudinal and lateral motion to park in a smaller parking space in one maneuver. Its adaptability to changes in the vehicle model is verified by joint CarSim and MATLAB simulation, which shows that the algorithm converges within a few iterations. Finally, experiments on a real vehicle platform further verify the effectiveness of the proposed method. Compared with obtaining rewards using simulation, the proposed method achieves a better final parking attitude and success rate.
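
To make the search idea concrete, the following is a minimal sketch of a depth-truncated Monte Carlo tree search guided by a prior over actions and a learned state value. It is not the authors' implementation. The PUCT selection rule is borrowed from AlphaGo Zero-style search as an assumption, and every name (`Node`, `run_search`, `policy_value_fn`, `step_fn`, `c_puct`, `max_depth`) is hypothetical. The toy one-dimensional "parking" task at the end stands in for a vehicle model, and terminal states and collisions are ignored.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One tree node; `prior` is the search probability of the branch leading here."""
    prior: float
    visit_count: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)

    def mean_value(self) -> float:
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float) -> float:
    # Mean value plus an exploration bonus weighted by the prior (assumed PUCT form).
    bonus = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.mean_value() + bonus


def select_action(parent: Node, c_puct: float):
    return max(parent.children, key=lambda a: puct_score(parent, parent.children[a], c_puct))


def expand(node: Node, priors: dict) -> None:
    node.children = {action: Node(prior=p) for action, p in priors.items()}


def run_search(root_state, step_fn, policy_value_fn,
               n_simulations=100, max_depth=4, c_puct=1.5):
    """Return root visit counts per action after a depth-truncated search."""
    root = Node(prior=1.0)
    root_priors, _ = policy_value_fn(root_state)
    expand(root, root_priors)

    for _ in range(n_simulations):
        node, state, path, depth = root, root_state, [root], 0
        # Descend until an unexpanded node or the truncation depth is reached.
        while node.children and depth < max_depth:
            action = select_action(node, c_puct)
            state = step_fn(state, action)
            node = node.children[action]
            path.append(node)
            depth += 1
        # Truncation: the value estimate replaces a rollout to the end of the episode.
        priors, value = policy_value_fn(state)
        if depth < max_depth:
            expand(node, priors)
        # Single-agent backup: the same value is added along the whole path.
        for visited in path:
            visited.visit_count += 1
            visited.value_sum += value

    return {action: child.visit_count for action, child in root.children.items()}


# Toy stand-ins (hypothetical): integer position, target slot at position 3.
def toy_step(state, action):
    return state + action


def toy_policy_value(state):
    priors = {-1: 0.5, 1: 0.5}                  # uniform "guidance network"
    value = -min(1.0, abs(state - 3) / 5)       # closer to the slot is better
    return priors, value


if __name__ == "__main__":
    counts = run_search(0, toy_step, toy_policy_value)
    print(counts)  # the move toward the slot should collect more visits
```

In a setup like this, the visit counts at the root would typically serve both as the move-selection criterion and as a training target for the action guidance network, while the observed trajectory return would supervise the value network. How the paper weights exploration with trajectory returns and adapts the exploration scheme is not specified in the abstract and is not modeled here.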
