Active Vision via Extremum Seeking for Robots in Unstructured Environments: Applications in Object Recognition and Manipulation

In this paper, a novel active vision strategy is proposed for optimizing the viewpoint of a robot’s vision sensor with respect to a given success criterion. The strategy is based on extremum seeking control (ESC), which offers two main advantages: 1) The approach is model free: it does not require an explicit objective function or any other task model to calculate the gradient direction for viewpoint optimization. This opens new possibilities for active vision in unstructured environments, since a priori knowledge of the surroundings and the target objects is not required. 2) ESC conducts continuous optimization, backed by mechanisms for escaping local maxima. This enables efficient execution of an active vision task. We demonstrate our approach with two applications, in object recognition and in manipulation, where the model-free approach brings distinct benefits. In object recognition, our framework removes the dependence on offline training data for viewpoint optimization and makes the system robust to occlusions and changing lighting conditions. In object manipulation, the model-free approach allows us to increase the success rate of a grasp synthesis algorithm without an object model; the algorithm uses only continuous measurements of the objective value, i.e., the grasp quality. Our experiments show that continuous viewpoint optimization can efficiently increase the data quality for the underlying algorithm while maintaining robustness.

Note to Practitioners: Vision sensors give robots flexibility and robustness in both industrial and domestic settings by supplying the data required to analyze the surroundings and the state of the task. However, the quality of these data can range from very high to poor depending on the viewing angle of the vision sensor. For example, if the robot aims to recognize an object, images taken from certain angles (e.g., of feature-rich surfaces) can be more descriptive than others; if the robot’s goal is to manipulate an object, observing it from a viewpoint that reveals easy-to-grasp “handles” makes the task simpler to execute. The algorithm presented in this paper aims to provide the robot with visual data of high quality relative to the task at hand by changing the viewpoint of the vision sensor. Unlike other methods in the literature, our method does not require any task model (it is therefore model free) and uses only a quality value that can be measured from the current viewpoint (e.g., the object recognition success rate for the current image). The viewpoint of the sensor is changed continuously to increase the quality value until the robot is confident enough about the success of the execution. We demonstrate the application of the algorithm in the object recognition and manipulation domains. Nevertheless, it can be applied to many other robotics tasks in which the viewing angle of the scene affects the robot’s performance.
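The model-free idea can be illustrated with a minimal sketch of a classic sinusoidal-perturbation extremum seeking loop. This is not necessarily the ESC variant used in this paper, and it omits any mechanism for escaping local maxima. The function `quality`, the single scalar viewpoint parameter `theta`, and all gains and frequencies are hypothetical choices made only for illustration. The loop never evaluates a gradient or a model of `quality`; it only reads the measured value at the current (slightly perturbed) viewpoint.

```python
import math


def quality(theta):
    # Hypothetical stand-in for a measured quality value (e.g., a recognition
    # score) as a function of one viewpoint angle; its peak is at theta = 1.0.
    return 1.0 - (theta - 1.0) ** 2


def extremum_seek(measure, theta0, steps=20000, dt=0.001,
                  amp=0.1, omega=50.0, gain=5.0, hp_cut=5.0):
    """Perturbation-based extremum seeking on a scalar parameter.

    All parameter names and default values are illustrative assumptions.
    """
    theta_hat = theta0        # current estimate of the best viewpoint
    lp = measure(theta0)      # low-pass state tracking the slow part of y
    for k in range(steps):
        t = k * dt
        # Superimpose a small sinusoidal dither on the viewpoint.
        dither = amp * math.sin(omega * t)
        y = measure(theta_hat + dither)
        # High-pass the measurement by subtracting its low-pass estimate.
        lp += dt * hp_cut * (y - lp)
        # Demodulate: on average this is proportional to the local slope.
        grad_est = (y - lp) * math.sin(omega * t)
        # Integrate the slope estimate to climb toward the maximum.
        theta_hat += dt * gain * grad_est
    return theta_hat


if __name__ == "__main__":
    best = extremum_seek(quality, theta0=0.0)
    print(f"estimated best viewpoint: {best:.3f}")
```

In a robot setting, `measure` would be replaced by whatever task-specific quality value is available from the current view, and the loop would stop once that value exceeds a confidence threshold chosen for the task.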
