Object recognition using local invariant features for robotic applications: A survey

The main goal of this survey is to present a complete analysis of object recognition methods based on local invariant features from a robotics perspective: a summary that developers of robot vision applications can use when selecting and developing object recognition systems. The survey includes a brief description of the main approaches reported in the literature, with more specific analyses of local interest point computation methods, local descriptor computation and matching methods, and geometric verification methods. The methods are analyzed against the main requirements of robotics applications, such as real-time operation with limited on-board computational resources, and constrained observational conditions derived from the robot geometry (e.g., limited camera resolution). In addition, various object recognition systems are evaluated in a domestic service-robot environment, where the final task to be performed by the robot is the manipulation of objects. The reported results lead to the following conclusions: (i) the most suitable keypoint detectors are ORB, BRISK, Fast Hessian, and DoG; (ii) the most suitable descriptors are ORB, BRISK, SIFT, and SURF; (iii) the final performance of object recognition systems using local invariant features under real-world conditions depends strongly on the geometric verification methods being used; and (iv) the best performing object recognition systems are built using ORB-ORB and DoG-SIFT keypoint-descriptor combinations. ORB-ORB based systems are faster, while DoG-SIFT based systems are more robust to real-world conditions.
A complete analysis of object recognition methods based on local invariant features from a robotics perspective is given.
A brief description of the main approaches reported in the literature is included.
Methods are analyzed by considering the main requirements of robotics applications.
Best performing object recognition systems are built using ORB-ORB and DoG-SIFT keypoint-descriptor combinations.
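
The descriptor matching step mentioned above can be illustrated with a small sketch. ORB and BRISK produce binary descriptors, which are usually compared with the Hamming distance. The code below is a toy illustration only, not the pipeline evaluated in the survey: the function names, the 8-bit descriptors (real ORB descriptors are 256 bits), and the nearest-neighbor ratio test with a threshold of 0.8 are all assumptions made for the example.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_descriptors(
    query: List[int], train: List[int], ratio: float = 0.8
) -> List[Tuple[int, int, int]]:
    """Brute-force matching with a nearest-neighbor ratio test.

    Returns (query_index, train_index, distance) for each accepted match.
    The ratio threshold is an assumed value, not one taken from the survey.
    """
    matches = []
    for qi, q in enumerate(query):
        # Distance from this query descriptor to every reference descriptor,
        # sorted so the two closest candidates come first.
        ranked = sorted((hamming(q, t), ti) for ti, t in enumerate(train))
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        (d1, best), (d2, _) = ranked[0], ranked[1]
        # Keep the match only if the best candidate is clearly closer
        # than the second best; ambiguous matches are discarded.
        if d1 < ratio * d2:
            matches.append((qi, best, d1))
    return matches


# Toy 8-bit descriptors standing in for a reference object and a scene image.
reference = [0b11110000, 0b00001111, 0b10101010]
scene = [0b11110001, 0b00011111, 0b11001100]

matches = match_descriptors(scene, reference)
print(matches)
```

In this toy data the third scene descriptor is equally far from all three reference descriptors, so the ratio test rejects it. In a full system the surviving matches would then be passed to a geometric verification stage, which the survey identifies as decisive for performance under real-world conditions.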


[32]  Cordelia Schmid,et al.  Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[33]  Robert C. Bolles,et al.  Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography , 1981, CACM.

[34]  Dietrich Paulus,et al.  An Evaluation of Open Source SURF Implementations , 2010, RoboCup.

[35]  Dana H. Ballard,et al.  Generalizing the Hough transform to detect arbitrary shapes , 1981, Pattern Recognit..

[36]  Vincent Lepetit,et al.  BRIEF: Binary Robust Independent Elementary Features , 2010, ECCV.

[37]  Carlo Tomasi,et al.  Good features to track , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[38]  Jiri Matas,et al.  Robust wide-baseline stereo from maximally stable extremal regions , 2004, Image Vis. Comput..

[39]  Axel Pinz,et al.  Active Object Categorization on a Humanoid Robot , 2011, VISAPP.

[40]  Luc Van Gool,et al.  Speeded-Up Robust Features (SURF) , 2008, Comput. Vis. Image Underst..

[41]  Geoffrey A. Hollinger,et al.  HERB: a home exploring robotic butler , 2010, Auton. Robots.

[42]  Michel Devy,et al.  Textured Object Recognition: Balancing Model Robustness and Complexity , 2015, CAIP.

[43]  Paul Beaudet,et al.  Rotationally invariant image operators , 1978 .

[44]  Bin Fan,et al.  Local Intensity Order Pattern for feature description , 2011, 2011 International Conference on Computer Vision.

[45]  Roland Siegwart,et al.  BRISK: Binary Robust invariant scalable keypoints , 2011, 2011 International Conference on Computer Vision.

[46]  Javier Ruiz-del-Solar,et al.  A Fast Probabilistic Model for Hypothesis Rejection in SIFT-Based Object Recognition , 2006, CIARP.

[47]  Aly A. Farag,et al.  CSIFT: A SIFT Descriptor with Color Invariant Characteristics , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[48]  Yan Ke,et al.  PCA-SIFT: a more distinctive representation for local image descriptors , 2004, CVPR 2004.

[49]  Myo-Taeg Lim,et al.  MDGHM-SURF: A robust local image descriptor based on modified discrete Gaussian-Hermite moment , 2015, Pattern Recognit..

[50]  Siddhartha S. Srinivasa,et al.  Object recognition and full pose registration from a single image for robotic manipulation , 2009, 2009 IEEE International Conference on Robotics and Automation.

[51]  Koen E. A. van de Sande,et al.  Fisher and VLAD with FLAIR , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[52]  Jean-Michel Morel,et al.  ASIFT: A New Framework for Fully Affine Invariant Image Comparison , 2009, SIAM J. Imaging Sci..

[53]  David G. Lowe,et al.  Scalable Nearest Neighbor Algorithms for High Dimensional Data , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[54]  Krystian Mikolajczyk,et al.  Evaluation of local detectors and descriptors for fast feature matching , 2012, Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012).

[55]  Mohammed Bennamoun,et al.  Efficient RGB-D object categorization using cascaded ensembles of randomized decision trees , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[56]  Antonios Gasteratos,et al.  Simultaneous Visual Object Recognition and Position Estimation Using SIFT , 2009, ICIRA.

[57]  John J. Leonard,et al.  Monocular SLAM Supported Object Recognition , 2015, Robotics: Science and Systems.

[58]  Jinguo Liu,et al.  Using an Improved SIFT Algorithm and Fuzzy Closed-Loop Control Strategy for Object Recognition in Cluttered Scenes , 2015, PloS one.

[59]  Cordelia Schmid,et al.  A Performance Evaluation of Local Descriptors , 2005, IEEE Trans. Pattern Anal. Mach. Intell..

[60]  Tom Drummond,et al.  Real-Time Video Annotations for Augmented Reality , 2005, ISVC.

[61]  Danica Kragic,et al.  “Robot bring me something to drink from”: object representation for transferring task specific grasps , 2013, ICRA 2013.

[62]  Tom Drummond,et al.  Machine Learning for High-Speed Corner Detection , 2006, ECCV.

[63]  Tsuhan Chen,et al.  Robotic Object Detection: Learning to Improve the Classifiers Using Sparse Graphs for Path Planning , 2011, IJCAI.

[64]  Cordelia Schmid,et al.  Local Grayvalue Invariants for Image Retrieval , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[65]  Yuexing Han,et al.  Recognize objects with three kinds of information in landmarks , 2013, Pattern Recognit..

[66]  Tamim Asfour,et al.  Combining Harris interest points and the SIFT descriptor for fast scale-invariant object recognition , 2009, 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems.