A Multi-Modal States based Vehicle Descriptor and Dilated Convolutional Social Pooling for Vehicle Trajectory Prediction

Precise trajectory prediction of surrounding vehicles is critical for the decision-making of autonomous vehicles, and learning-based approaches are well recognized for their robustness. However, state-of-the-art learning-based methods overlook 1) the usefulness of a vehicle's multi-modal state information for prediction and 2) the mutually exclusive relationship between the receptive field over the global traffic scene and the local position resolution when modeling vehicle interactions, both of which may limit prediction accuracy. We therefore propose a vehicle-descriptor-based LSTM model with dilated convolutional social pooling (VD+DCS-LSTM) to address these issues. First, each vehicle's multi-modal state information is used as the model's input, and a new vehicle descriptor encoded by stacked sparse auto-encoders is proposed to capture the deep interactive relationships between the various states, achieving optimal feature extraction and effective use of the multi-modal inputs. Second, an LSTM encoder encodes the historical sequences of vehicle descriptors, and a novel dilated convolutional social pooling is proposed to better model the vehicles' spatial interactions. Third, an LSTM decoder predicts the probability distribution of future trajectories based on maneuvers. The overall model was validated on the NGSIM US-101 and I-80 datasets, where it outperforms the latest benchmark.
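The trade-off between receptive field and position resolution can be illustrated with a little arithmetic. The sketch below is not the paper's implementation; the function names, layer counts, kernel sizes, and dilation rates are assumptions chosen only for illustration. It uses the standard result that a stack of stride-1 convolutions has a receptive field of 1 + Σ (k − 1)·d along each axis, so dilation enlarges the scene context without the downsampling that pooling would introduce.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field along one axis of stacked stride-1 convolutions.

    Each layer with kernel size k and dilation d widens the field by (k - 1) * d.
    """
    if len(kernel_sizes) != len(dilations):
        raise ValueError("kernel_sizes and dilations must have the same length")
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid 1-D cross-correlation whose kernel taps are `dilation` cells apart.

    Stride is 1, so the output keeps the input's grid spacing (no downsampling).
    """
    span = (len(kernel) - 1) * dilation + 1  # input cells covered by one output
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


# Hypothetical configurations: three 3-tap layers, with and without dilation.
plain_rf = receptive_field([3, 3, 3], [1, 1, 1])
dilated_rf = receptive_field([3, 3, 3], [1, 2, 4])
print(plain_rf, dilated_rf)
```

Both configurations have the same number of weights per layer; only the spacing of the kernel taps differs. In a social-pooling grid this would mean each output cell sees more of the surrounding traffic while still being indexed at the original cell resolution.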
