Graph2Kernel Grid-LSTM: A Multi-Cued Model for Pedestrian Trajectory Prediction by Learning Adaptive Neighborhoods

Pedestrian trajectory prediction is a prominent research track that has advanced towards modelling social and contextual interactions in crowds, with extensive use of Long Short-Term Memory (LSTM) networks for the temporal representation of walking trajectories. Existing approaches use virtual neighborhoods in the form of a fixed grid for pooling the social states of pedestrians, with a tuning process that controls how social interactions are captured. This tailors performance to specific scenes but lowers the generalization capability of these approaches. In our work, we deploy Grid-LSTM, a recent extension of LSTM that operates over multidimensional feature inputs. We present a new perspective on interaction modeling by proposing that pedestrian neighborhoods can be adaptive in design. We use Grid-LSTM as an encoder to learn potential future neighborhoods and their influence on pedestrian motion, given the visual and spatial boundaries. Our model outperforms state-of-the-art approaches that combine similar features on several publicly tested surveillance videos. The experimental results illustrate that our approach generalizes across datasets that vary in scene features and crowd dynamics.
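
To make the fixed-grid baseline mentioned above concrete, the following is a minimal sketch of pooling neighbors' hidden states into a fixed square grid around one pedestrian. It is not the adaptive-neighborhood model proposed here; the function name `fixed_grid_pool`, the parameters `neighborhood` and `grid`, and the toy data are all assumptions made for illustration. The two hand-tuned parameters are the kind of scene-specific settings that the abstract argues limit generalization.

```python
import numpy as np

def fixed_grid_pool(positions, hidden, ego, neighborhood=4.0, grid=2):
    """Sum neighbors' hidden states into a grid x grid map centred on `ego`.

    positions: (n, 2) array of x, y coordinates (hypothetical units: metres)
    hidden:    (n, d) array of per-pedestrian hidden states
    neighborhood: side length of the square neighborhood (hand-tuned)
    grid:      number of cells per side (hand-tuned)
    """
    n, d = hidden.shape
    pooled = np.zeros((grid, grid, d))
    cell = neighborhood / grid    # side length of one cell
    half = neighborhood / 2.0     # half-width of the whole neighborhood
    for j in range(n):
        if j == ego:
            continue
        dx, dy = positions[j] - positions[ego]
        # Neighbors on or outside the boundary are ignored entirely.
        if abs(dx) >= half or abs(dy) >= half:
            continue
        col = int((dx + half) // cell)
        row = int((dy + half) // cell)
        pooled[row, col] += hidden[j]
    return pooled

# Toy scene: four pedestrians with one-hot "hidden states".
positions = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 0.5], [5.0, 5.0]])
hidden = np.eye(4)
pooled = fixed_grid_pool(positions, hidden, ego=0)
print(pooled.shape)
```

In this sketch the fourth pedestrian falls outside the 4 x 4 neighborhood and contributes nothing, regardless of how relevant it may be to the ego pedestrian's motion; changing `neighborhood` or `grid` changes which interactions are captured.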
