Motion Planning in Dynamic Environments Using Context-Aware Human Trajectory Prediction

The separate fields of motion planning, mapping, and human trajectory prediction have each advanced considerably. However, the literature still offers few practical frameworks that enable mobile manipulators to perform whole-body movements while accounting for the predicted motion of moving obstacles. Previous optimisation-based motion planning approaches that use distance fields have suffered from the high computational cost of updating the environment representation. We demonstrate that GPU-accelerated predicted composite distance fields significantly reduce computation time compared to calculating distance fields from scratch. We integrate this technique into a complete motion planning and perception framework that accounts for the predicted motion of humans in dynamic environments, enabling both reactive and pre-emptive motion planning. To achieve this, we propose and implement a novel human trajectory prediction method that combines intention recognition with trajectory optimisation-based motion planning. We validate the resulting framework on a real-world Toyota Human Support Robot (HSR) using live RGB-D sensor data from the onboard camera. In addition to providing analysis on a publicly available dataset, we release the Oxford Indoor Human Motion (Oxford-IHM) dataset and demonstrate state-of-the-art performance in human trajectory prediction. Oxford-IHM is a human trajectory prediction dataset in which people walk between regions of interest in an indoor environment. Both static and robot-mounted RGB-D cameras observe the people while they are tracked with a motion-capture system.
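The idea of composing a distance field rather than recomputing it can be illustrated with a minimal CPU sketch in plain Python. It is not the GPU implementation described above, and every name in it is hypothetical. It assumes a small 2D grid, a person approximated by a disc with an analytic signed distance, and a constant-velocity placeholder standing in for the trajectory predictor. The static field is computed once; each predicted time step then only needs a cell-wise minimum.

```python
import math

def static_distance_field(occupied, width, height):
    """Brute-force distance (in cells) from every cell to the nearest occupied cell."""
    return [[min(math.hypot(x - ox, y - oy) for ox, oy in occupied)
             for x in range(width)]
            for y in range(height)]

def disc_distance(x, y, centre, radius):
    """Signed distance from cell (x, y) to a disc standing in for a person (assumption)."""
    return math.hypot(x - centre[0], y - centre[1]) - radius

def predict_positions(start, velocity, steps):
    """Placeholder predictor (assumption): constant velocity, one position per time step."""
    return [(start[0] + velocity[0] * k, start[1] + velocity[1] * k)
            for k in range(steps)]

def composite_fields(static_field, predicted_centres, radius):
    """One field per time step: cell-wise minimum of static and moving-obstacle distances."""
    return [[[min(d, disc_distance(x, y, c, radius))
              for x, d in enumerate(row)]
             for y, row in enumerate(static_field)]
            for c in predicted_centres]

WIDTH, HEIGHT = 8, 6
walls = [(0, y) for y in range(HEIGHT)]  # a wall along the left edge of the grid
static_field = static_distance_field(walls, WIDTH, HEIGHT)  # computed only once
centres = predict_positions(start=(6.0, 3.0), velocity=(-1.0, 0.0), steps=3)
fields = composite_fields(static_field, centres, radius=1.0)  # indexed [step][y][x]
```

Here `fields[k][y][x]` is the distance a planner would query for cell `(x, y)` at predicted step `k`. The static field is never modified, so a new set of predictions only repeats the inexpensive minimum step.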
