Training Deep Neural Networks for Visual Servoing

We present a deep neural network-based method for high-precision, robust, real-time 6-DOF positioning by visual servoing. A convolutional neural network is fine-tuned to estimate the relative pose between the current and desired images, and a pose-based visual servoing control law is then used to reach the desired pose. The paper describes how to create, efficiently and automatically, the dataset used to train the network. We show that this approach enables robust handling of perturbations such as occlusions and lighting variations. We then propose training a scene-agnostic network by feeding both the desired and current images into a deep network. The method is validated on a 6-DOF robot.
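
To make the control step concrete, the following is a minimal sketch of one common pose-based visual servoing law, not necessarily the exact variant used in the paper. It assumes the network output has already been converted to a rotation matrix `R` and a translation `t` giving the pose of the current camera frame expressed in the desired camera frame; the function names, the gain value, and this frame convention are illustrative assumptions. Under this convention the camera velocity is v = -λ Rᵀt and ω = -λ θu, where θu is the axis-angle form of `R`.

```python
import numpy as np


def rotation_to_theta_u(R):
    """Convert a 3x3 rotation matrix to its axis-angle vector theta * u.

    Note: this simple formula is ill-conditioned for angles close to pi.
    """
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-9:
        # No rotation error: the axis is undefined, so return zero.
        return np.zeros(3)
    axis = np.array([
        R[2, 1] - R[1, 2],
        R[0, 2] - R[2, 0],
        R[1, 0] - R[0, 1],
    ]) / (2.0 * np.sin(theta))
    return theta * axis


def pbvs_velocity(R, t, gain=0.5):
    """Return (v, omega), the camera velocity in the current camera frame.

    Assumption: (R, t) is the pose of the current camera frame expressed
    in the desired camera frame, e.g. as estimated by the network.
    """
    v = -gain * R.T @ t                      # translational velocity
    omega = -gain * rotation_to_theta_u(R)   # rotational velocity
    return v, omega
```

Applying this velocity at each iteration, with a fresh pose estimate from the network, drives both the translation and rotation errors toward zero; at the desired pose (`R` equal to the identity and `t` equal to zero) the commanded velocity vanishes.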
