Paper information - Fast self-supervised on-line training for object recognition specifically for robotic applications

Today, most recognition pipelines are trained either at an off-line stage, which provides systems with pre-segmented images and predefined objects, or at an on-line stage, which requires a human supervisor to tediously control the learning. Self-supervised on-line training of recognition pipelines without human intervention is a highly desirable goal, as it allows systems to learn unknown, environment-specific objects on the fly. We propose a fast, automatic system that can extract and learn unknown objects with minimal human intervention by employing a two-level pipeline that combines the advantages of RGB-D sensors for object extraction and high-resolution cameras for object recognition. Furthermore, we significantly improve recognition results with local features by implementing a novel keypoint orientation scheme, which leads to highly invariant but discriminative object signatures. Using only one image per object for training, our system achieves a recognition rate of 79% for 18 objects, benchmarked on 42 scenes with random poses, scales and occlusion, while taking only 7 seconds to train. Additionally, we evaluate our orientation scheme on the state-of-the-art 56-object SDU dataset, boosting accuracy for one training view per object by +37% to 78% and peaking at a performance of 98% for 11 training views.
