RNN-based Pedestrian Crossing Prediction using Activity and Pose-related Features

Pedestrian crossing prediction is a crucial task for autonomous driving. Numerous studies show that early estimation of a pedestrian's intention can reduce or even prevent a high percentage of accidents. In this paper, several variations of a deep learning system are proposed to address this problem. Each proposed model is composed of two parts: a CNN-based feature extractor and an RNN module. All models were trained and tested on the JAAD dataset. The results indicate that the choice of feature extraction method, the inclusion of additional variables such as pedestrian gaze direction and discrete orientation, and the chosen RNN type have a significant impact on final performance.
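The two-part structure described above (a CNN-based feature extractor followed by an RNN module) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class name CrossingPredictor, the tiny convolutional stack, all layer sizes, and the encoding of gaze direction and discrete orientation as two numeric values per frame are assumptions made only for illustration. GRU and LSTM are shown as two possible RNN types.

```python
import torch
import torch.nn as nn


class CrossingPredictor(nn.Module):
    """Hypothetical sketch: per-frame CNN features -> RNN over time -> crossing probability."""

    def __init__(self, extra_dim=2, feat_dim=32, hidden_dim=16, rnn_type="gru"):
        super().__init__()
        # Tiny stand-in for a CNN feature extractor (layer sizes are assumptions).
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(8, feat_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (N, feat_dim, 1, 1)
            nn.Flatten(),             # -> (N, feat_dim)
        )
        # The RNN type is selectable; both classes share the same constructor arguments here.
        rnn_cls = {"gru": nn.GRU, "lstm": nn.LSTM}[rnn_type]
        self.rnn = rnn_cls(feat_dim + extra_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, frames, extra):
        # frames: (B, T, 3, H, W) pedestrian crops; extra: (B, T, extra_dim) additional variables
        b, t = frames.shape[:2]
        # Run the CNN on every frame by folding time into the batch dimension.
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        # Append the additional variables (assumed: gaze, orientation) to each frame's features.
        seq = torch.cat([feats, extra], dim=-1)
        out, _ = self.rnn(seq)  # out: (B, T, hidden_dim)
        # Classify from the last time step; sigmoid gives a crossing probability per sequence.
        return torch.sigmoid(self.head(out[:, -1])).squeeze(-1)


model = CrossingPredictor(rnn_type="gru")
frames = torch.rand(2, 8, 3, 64, 64)  # 2 sequences of 8 random RGB frames
extra = torch.rand(2, 8, 2)           # 2 extra values per frame
probs = model(frames, extra)          # shape (2,), values in [0, 1]
```

Passing rnn_type="lstm" swaps the recurrent module without changing the rest of the pipeline, which is one simple way to compare RNN types under otherwise identical conditions.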
