A Novel Camera Fusion Method Based on Switching Scheme and Occlusion-Aware Object Detection for Real-Time Robotic Grasping

Real-time vision-based robotic grasping is challenging in clutter. In such scenes, the target object must be perceived accurately even though it may be occluded by, or misrecognized among, many distractors, including irrelevant objects and the robotic arm itself. In addition, a camera's limited field of view (FOV) makes it easy for objects to move out of view. We develop a novel camera fusion method for pose estimation, based on a switching scheme, for real-time robotic grasping under hybrid eye-in-hand (EIH)/eye-to-hand (ETH) configurations. Objects are locked using occlusion-aware object detection, and a switching function then selects between single-camera pose estimation and multi-camera fusion. This method improves the accuracy of pose estimation and the robustness of dynamic grasping under occlusion. Experimental results on pose estimation and real-time robotic grasping in clutter verify the effectiveness of the proposed method.
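As an illustration of the switching idea only, the following minimal sketch chooses between a single-camera estimate and a fused estimate according to how much of the object each camera sees. Everything in it is an assumption made for illustration and is not taken from the paper: the `CameraEstimate` type, the `visible_fraction` score, the `min_visible` threshold, and the visibility-weighted average, which stands in for whatever fusion the authors actually use. Only position is fused here; orientation is left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

import numpy as np


@dataclass
class CameraEstimate:
    """Hypothetical per-camera result for one tracked object."""
    position: np.ndarray      # 3-vector, assumed already in the robot base frame
    visible_fraction: float   # 0..1, assumed output of an occlusion-aware detector


def select_pose(
    eih: Optional[CameraEstimate],
    eth: Optional[CameraEstimate],
    min_visible: float = 0.5,  # illustrative threshold, not from the paper
) -> Tuple[str, Optional[np.ndarray]]:
    """Return (mode, position), where mode is 'fusion', 'eih', 'eth' or 'none'."""
    # A camera is usable if it produced an estimate and the object is
    # sufficiently unoccluded in its view.
    eih_ok = eih is not None and eih.visible_fraction >= min_visible
    eth_ok = eth is not None and eth.visible_fraction >= min_visible

    if eih_ok and eth_ok:
        # Both views usable: fuse with weights proportional to visibility.
        weights = np.array([eih.visible_fraction, eth.visible_fraction])
        weights = weights / weights.sum()
        fused = weights[0] * eih.position + weights[1] * eth.position
        return "fusion", fused
    if eih_ok:
        return "eih", eih.position
    if eth_ok:
        return "eth", eth.position
    # Object is occluded or out of view in both cameras.
    return "none", None
```

For example, passing an EIH estimate with `visible_fraction=0.2` and an ETH estimate with `visible_fraction=0.9` returns the ETH position unchanged, while two well-visible estimates return their weighted mean.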
