An LSTM-Based Autonomous Driving Model Using Waymo Open Dataset

The recently released Waymo Open Dataset provides a platform for crowdsourcing solutions to fundamental challenges for automated vehicles (AVs), such as 3D detection and tracking. While the dataset provides a large amount of high-quality, multi-source driving information, academic researchers are more interested in the underlying driving policy programmed into Waymo self-driving cars, which is inaccessible because AV manufacturers treat it as proprietary. Consequently, researchers must make various assumptions to implement AV components in their models or simulations, and these assumptions may not represent realistic interactions in real-world traffic. This paper therefore introduces an approach for learning a long short-term memory (LSTM)-based model that imitates the behavior of Waymo's self-driving model. The proposed model is evaluated using mean absolute error (MAE). The experimental results show that it outperforms several baseline models in driving action prediction. In addition, a visualization tool is presented for verifying the model's performance.
