VUNet: Dynamic Scene View Synthesis for Traversability Estimation Using an RGB Camera

We present VUNet, a novel view (VU) synthesis method for mobile robots in dynamic environments, and its application to the estimation of future traversability. Our method predicts future images for given virtual robot velocity commands using only RGB images at the previous and current time steps. The future images result from applying two types of image changes to the previous and current images: changes caused by a different camera pose, and changes due to the motion of dynamic obstacles. We learn to predict these two types of changes separately using two novel network architectures, SNet and DNet. We combine SNet and DNet to synthesize future images, which we pass to our previously presented method GONet [N. Hirose, A. Sadeghian, M. Vazquez, P. Goebel, and S. Savarese, “GONet: A semi-supervised deep learning approach for traversability estimation,” in Proc. IEEE International Conference on Intelligent Robots and Systems, 2018, pp. 3044–3051] to estimate the traversable areas around the robot. Our quantitative and qualitative evaluations indicate that our approach to view synthesis predicts accurate future images in both static and dynamic environments. We also show that these virtual images can be used to estimate future traversability correctly. We apply our view synthesis-based traversability estimation method in two assisted teleoperation applications.
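The data flow described above can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the function name, the `(linear, angular)` command format, the order in which the two networks are chained, and the toy stand-ins for DNet, SNet, and GONet are all assumptions made so that the example runs on its own.

```python
from typing import Callable, List, Sequence, Tuple

import numpy as np

Image = np.ndarray               # H x W x 3 RGB array with values in [0, 1]
Command = Tuple[float, float]    # (linear, angular) virtual velocity; this format is an assumption


def predict_future_traversability(
    prev_img: Image,
    cur_img: Image,
    commands: Sequence[Command],
    dnet: Callable[[Image, Image], Image],
    snet: Callable[[Image, Command], Image],
    gonet: Callable[[Image], float],
) -> List[Tuple[Image, float]]:
    """Return one (future_image, traversability_score) pair per virtual command."""
    # Assumption: the obstacle-motion change does not depend on the command,
    # so it is predicted once from the previous and current images.
    moved = dnet(prev_img, cur_img)
    results = []
    for cmd in commands:
        # Assumption: the camera-pose change is applied on top of the
        # obstacle-motion prediction for each virtual command.
        future = snet(moved, cmd)
        results.append((future, gonet(future)))
    return results


# Toy stand-ins (NOT the real networks), used only to make the sketch executable.
def toy_dnet(prev_img: Image, cur_img: Image) -> Image:
    # Linear extrapolation of pixel values as a placeholder for obstacle motion.
    return np.clip(2.0 * cur_img - prev_img, 0.0, 1.0)


def toy_snet(img: Image, cmd: Command) -> Image:
    # Horizontal shift proportional to angular velocity as a placeholder for a pose change.
    return np.roll(img, shift=int(round(cmd[1])), axis=1)


def toy_gonet(img: Image) -> float:
    # Mean brightness as a placeholder for a traversability probability.
    return float(img.mean())


prev = np.zeros((4, 6, 3))
cur = np.full((4, 6, 3), 0.5)
commands = [(0.5, 0.0), (0.5, 1.0)]
results = predict_future_traversability(prev, cur, commands, toy_dnet, toy_snet, toy_gonet)
```

In a real system the three callables would wrap trained models; passing them in as arguments keeps the sketch independent of any particular deep learning framework.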
