Higher Order Function Networks for View Planning and Multi-View Reconstruction

We consider the problem of planning views for a robot to acquire images of an object for visual inspection and reconstruction. In contrast to offline methods, which require a 3D model of the object as input, and online methods, which rely only on local measurements, our method uses a neural network that encodes shape information for a large number of objects. We build on recent deep learning methods that generate a complete 3D reconstruction of an object from a single image. Specifically, we extend a recent method that uses Higher Order Functions (HOF) to represent the shape of the object. We generalize this method to accept multiple images as input and establish a connection between visibility and reconstruction quality. This relationship forms the foundation of our view planning method, in which we compute viewpoints that visually cover the output of the multi-view HOF network with as few images as possible. Experiments indicate that our method offers a good compromise between online and offline methods: like online methods, it does not require the true object model as input, yet it needs far fewer views. In most cases, its performance is comparable to the optimal offline case, even on object classes the network has not been trained on.
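
The planning step described above, covering the reconstructed point cloud with as few views as possible, can be read as a greedy set-cover over candidate viewpoints. The sketch below illustrates that reading only; it is not the paper's implementation. The Fibonacci-sphere view sampler, the back-facing visibility heuristic (a crude stand-in for a proper test such as hidden point removal or ray casting), and all function names and thresholds are illustrative assumptions.

```python
# Minimal sketch: greedy coverage-based view selection over a predicted point cloud.
# All names, thresholds, and the visibility heuristic are illustrative assumptions.
import numpy as np


def candidate_viewpoints(n_views: int = 64, radius: float = 2.0) -> np.ndarray:
    """Sample candidate camera positions on a sphere (Fibonacci spiral)."""
    i = np.arange(n_views)
    phi = np.arccos(1.0 - 2.0 * (i + 0.5) / n_views)   # polar angle
    theta = np.pi * (1.0 + 5.0 ** 0.5) * i              # golden-angle azimuth
    return radius * np.stack([np.sin(phi) * np.cos(theta),
                              np.sin(phi) * np.sin(theta),
                              np.cos(phi)], axis=1)


def visible_mask(points: np.ndarray, view: np.ndarray,
                 max_angle_deg: float = 75.0) -> np.ndarray:
    """Approximate visibility: a point counts as visible if its outward
    direction (from the object centroid) roughly faces the camera."""
    center = points.mean(axis=0)
    outward = points - center
    outward /= np.linalg.norm(outward, axis=1, keepdims=True) + 1e-9
    to_cam = view - points
    to_cam /= np.linalg.norm(to_cam, axis=1, keepdims=True) + 1e-9
    cos_angle = np.sum(outward * to_cam, axis=1)
    return cos_angle > np.cos(np.deg2rad(max_angle_deg))


def plan_views(points: np.ndarray, views: np.ndarray,
               target_coverage: float = 0.95) -> list[int]:
    """Greedily pick views until the desired fraction of points is covered."""
    covered = np.zeros(len(points), dtype=bool)
    chosen: list[int] = []
    masks = [visible_mask(points, v) for v in views]
    while covered.mean() < target_coverage:
        gains = [np.sum(m & ~covered) for m in masks]
        best = int(np.argmax(gains))
        if gains[best] == 0:   # no remaining view adds coverage
            break
        chosen.append(best)
        covered |= masks[best]
    return chosen


if __name__ == "__main__":
    # Stand-in for the multi-view network output: points on a unit sphere.
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(2048, 3))
    pts /= np.linalg.norm(pts, axis=1, keepdims=True)
    views = candidate_viewpoints()
    plan = plan_views(pts, views)
    print(f"selected {len(plan)} views: {plan}")
```

In this reading, the reconstructed point cloud stands in for the unknown true surface, so the quality of the plan depends on the connection between visibility and reconstruction quality noted above; a stronger visibility test would replace `visible_mask` without changing the greedy loop.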
