Analysis of the Effect of Various Input Representations for LSTM-Based Trajectory Prediction

Predicting the future trajectories of surrounding traffic participants is a key component of modern autonomous driving systems. This work analyzes how different representations of the input data affect prediction quality. The analyzed data comprises information recorded by the ego vehicle, including object recognition and object tracking results, as well as satellite images and map information. We propose a neural network that uses long short-term memory (LSTM) layers to capture the sequence-to-sequence nature of the underlying problem and a convolutional neural network (CNN) to take the surroundings of the predicted object into account. The network takes two inputs: the past trajectory of the predicted object and a bird’s-eye-view representation of the scene surrounding the object, which fuses various types of scene information, e.g., a satellite image and bounding boxes of other traffic participants. We achieve Euclidean distances between the predicted and ground-truth positions of 0.47 m and 6.19 m at prediction horizons of 1 s and 6 s, respectively. Additionally, we show the potential of our approach to transfer knowledge from similar road topologies to unseen intersections.
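
The reported metric, the Euclidean distance between predicted and ground-truth position at a fixed prediction horizon, can be illustrated with a minimal sketch. The function name `displacement_errors`, the sampling interval `dt`, and the convention that index 0 corresponds to the first time step after the prediction instant are assumptions made for this example; the abstract does not specify them.

```python
import math


def displacement_errors(predicted, ground_truth, dt, horizons):
    """Euclidean distance between predicted and true positions at given horizons.

    predicted, ground_truth: lists of (x, y) positions in metres, one per
        future time step; index 0 is assumed to be t = dt.
    dt: sampling interval in seconds (an assumption, not from the abstract).
    horizons: prediction times in seconds, e.g. [1.0, 6.0].
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("trajectories must have the same length")

    errors = {}
    for h in horizons:
        idx = round(h / dt) - 1  # time step that corresponds to horizon h
        if not 0 <= idx < len(predicted):
            raise ValueError(f"horizon {h} s lies outside the trajectory")
        (px, py), (gx, gy) = predicted[idx], ground_truth[idx]
        errors[h] = math.hypot(px - gx, py - gy)
    return errors


# Toy data sampled at 1 Hz (illustrative values only, not results from the paper).
pred = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
truth = [(1.0, 0.5), (2.0, 1.0), (2.5, 2.0), (3.0, 3.0), (3.0, 3.5), (3.0, 4.0)]
errors = displacement_errors(pred, truth, dt=1.0, horizons=[1.0, 6.0])
```

Averaging these per-trajectory values over a test set would give figures comparable in kind to the 0.47 m and 6.19 m quoted above, provided the same horizons and units are used.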
