Towards Robust Human Trajectory Prediction in Raw Videos

Human trajectory prediction has received increased attention lately due to its importance in applications such as autonomous vehicles and indoor robots. However, most existing methods make predictions based on human-labeled trajectories and ignore the errors and noises in detection and tracking. In this paper, we study the problem of human trajectory forecasting in raw videos, and show that the prediction accuracy can be severely affected by various types of tracking errors. Accordingly, we propose a simple yet effective strategy to correct the tracking failures by enforcing prediction consistency over time. The proposed "re-tracking" algorithm can be applied to any existing tracking and prediction pipelines. Experiments on public benchmark datasets demonstrate that the proposed method can improve both tracking and prediction performance in challenging real-world scenarios. The code and data are available at https://git.io/retracking-prediction.

[1]  Luc Van Gool,et al.  WILDTRACK: A Multi-camera HD Dataset for Dense Unscripted Pedestrian Detection , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[2]  Silvio Savarese,et al.  Learning Social Etiquette: Human Trajectory Understanding In Crowded Scenes , 2016, ECCV.

[3]  Raquel Urtasun,et al.  End-to-end Contextual Perception and Prediction with Interaction Transformer , 2020, 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[4]  Dinesh Manocha,et al.  AutoTrajectory: Label-free Trajectory Extraction and Prediction from Videos using Dynamic Points , 2020, ECCV.

[5]  Sridha Sridharan,et al.  Tracking by Prediction: A Deep Generative Model for Mutli-person Localisation and Tracking , 2018, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[7]  J. Malik,et al.  It Is Not the Journey but the Destination: Endpoint Conditioned Trajectory Prediction , 2020, ECCV.

[8]  Silvio Savarese,et al.  SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Sergio Casas,et al.  Implicit Latent Variable Model for Scene-Consistent Motion Forecasting , 2020, ECCV.

[10]  Jianren Wang,et al.  Inverting the Pose Forecasting Pipeline with SPF2: Sequential Pointcloud Forecasting for Sequential Pose Forecasting , 2020, CoRL.

[11]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[12]  Dariu M. Gavrila,et al.  Human motion trajectory prediction: a survey , 2019, Int. J. Robotics Res..

[13]  Juan Carlos Niebles,et al.  Peeking Into the Future: Predicting Future Person Activities and Locations in Videos , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Alexandre Alahi,et al.  Human Trajectory Forecasting in Crowds: A Deep Learning Perspective , 2020, IEEE Transactions on Intelligent Transportation Systems.

[15]  Luc Van Gool,et al.  You'll never walk alone: Modeling social behavior for multi-target tracking , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[16]  Abduallah A. Mohamed,et al.  Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Silvio Savarese,et al.  Social-BiGAT: Multimodal Trajectory Forecasting using Bicycle-GAN and Graph Attention Networks , 2019, NeurIPS.

[18]  Shuai Yi,et al.  Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction , 2020, ECCV.

[19]  Bin Yang,et al.  Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  Helbing,et al.  Social force model for pedestrian dynamics. , 1995, Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics.

[21]  Sergio Casas,et al.  IntentNet: Learning to Predict Intention from Raw Sensor Data , 2018, CoRL.

[22]  Harold W. Kuhn,et al.  The Hungarian method for the assignment problem , 1955, 50 Years of Integer Programming.

[23]  Fabio Tozeto Ramos,et al.  Simple online and realtime tracking , 2016, 2016 IEEE International Conference on Image Processing (ICIP).

[24]  Alessio Del Bue,et al.  MX-LSTM: Mixing Tracklets and Vislets to Jointly Forecast Trajectories and Head Poses , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25]  Alex Graves,et al.  Generating Sequences With Recurrent Neural Networks , 2013, ArXiv.

[26]  Renjie Liao,et al.  SpAGNN: Spatially-Aware Graph Neural Networks for Relational Behavior Forecasting from Sensor Data , 2019, 2020 IEEE International Conference on Robotics and Automation (ICRA).

[27]  Ramakant Nevatia,et al.  Multi-target tracking by online learning of non-linear motion patterns and robust appearance models , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Luis E. Ortiz,et al.  Who are you with and where are you going? , 2011, CVPR 2011.

[29]  Yoichi Sato,et al.  Future Person Localization in First-Person Videos , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[30]  R. Urtasun,et al.  PnPNet: End-to-End Perception and Prediction With Tracking in the Loop , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[31]  Bernt Schiele,et al.  Long-Term On-board Prediction of People in Traffic Scenes Under Uncertainty , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[32]  Eric Sommerlade,et al.  Modelling pedestrian trajectory patterns with Gaussian processes , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[33]  Jean Oh,et al.  Social Attention: Modeling Attention in Human Crowds , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[34]  Dinesh Manocha,et al.  BRVO: Predicting pedestrian trajectories using velocity-space reasoning , 2015, Int. J. Robotics Res..

[35]  Luc Van Gool,et al.  Robust tracking-by-detection using a detector confidence particle filter , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[36]  Masayoshi Tomizuka,et al.  Conditional Generative Neural System for Probabilistic Trajectory Prediction , 2019, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[37]  Silvio Savarese,et al.  Social LSTM: Human Trajectory Prediction in Crowded Spaces , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Silvio Savarese,et al.  Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[39]  Stefan Roth,et al.  MOTChallenge 2015: Towards a Benchmark for Multi-Target Tracking , 2015, ArXiv.