3D Move to See: Multi-perspective visual servoing for improving object views with semantic segmentation

In this paper, we present 3D Move to See (3DMTS), a new approach to visual servoing for robotics based on the principle of finding the next best view: a 3D camera array mounted on a robotic manipulator obtains multiple samples of the scene from different perspectives. The method applies semantic vision and an objective function to each perspective to sample a gradient indicating the direction of the next best view. The method is demonstrated in simulation and on a real robotic platform fitted with a custom 3D camera array, in the challenging scenario of robotic harvesting in a highly occluded and unstructured environment. On the real platform, moving the end effector along the gradient of the objective function was shown to lead to a locally optimal view of the object of interest, even amongst occlusions. Overall, 3DMTS achieved a mean increase in target size of 29.3%, roughly three times the 9.17% obtained by a baseline method using a single RGB-D camera. The results demonstrate qualitatively and quantitatively that 3DMTS performed better in most scenarios. The larger target in the final view will improve the detection of key features of the object of interest for further manipulation, such as grasping and harvesting.
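The gradient-sampling step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the objective (e.g. the segmented target's apparent size) has been evaluated at each camera of the array, fits a local linear model to those samples by least squares, and steps the end effector along the resulting gradient direction. The function names, the 1 cm step size, and the least-squares fit are illustrative assumptions.

```python
import numpy as np

def estimate_gradient(offsets, scores):
    """Estimate the local gradient of the objective from camera-array samples.

    offsets: (N, 3) camera positions relative to the array centre (metres).
    scores:  (N,) objective values per view, e.g. target segmentation area.

    Fits f(x) ~= f0 + g . x by least squares; g points toward views that
    increase the objective (the next best view direction).
    """
    A = np.hstack([offsets, np.ones((offsets.shape[0], 1))])
    sol, *_ = np.linalg.lstsq(A, scores, rcond=None)
    return sol[:3]  # gradient component of the fit

def next_best_step(offsets, scores, step_size=0.01):
    """Return a small end-effector displacement toward the next best view."""
    g = estimate_gradient(offsets, scores)
    n = np.linalg.norm(g)
    if n < 1e-9:
        return np.zeros(3)  # gradient vanishes: treat as a locally optimal view
    return step_size * g / n  # fixed-length step along the gradient direction
```

Iterating this step until the gradient magnitude falls below a threshold yields the locally optimal view reported in the paper; occlusions simply lower the objective for the blocked perspectives, so the fitted gradient points away from them.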
