Paper information - 3D object perception and perceptual learning in the RACE project

This paper describes a 3D object perception and perceptual learning system developed for a complex artificial cognitive agent working in a restaurant scenario. This system, developed within the scope of the European project RACE, integrates detection, tracking, learning and recognition of tabletop objects. Interaction capabilities were also developed to enable a human user to take the role of instructor and teach new object categories. Thus, the system learns in an incremental and open-ended way from user-mediated experiences. Based on an analysis of the memory requirements for storing both semantic and perceptual data, a dual memory approach, comprising a semantic memory and a perceptual memory, was adopted. The perceptual memory is the central data structure of the described perception and learning system. The goal of this paper is twofold: on the one hand, we provide a thorough description of the developed system, starting with motivations, cognitive considerations and architecture design, then providing details on the developed modules, and finally presenting a detailed evaluation of the system; on the other hand, we emphasize the crucial importance of the Point Cloud Library (PCL) for developing such a system.

This paper is a revised and extended version of Oliveira et al. (2014).

We describe an object perception and perceptual learning system. The system is able to detect, track and recognize tabletop objects. The system learns novel object categories in an open-ended fashion. The Point Cloud Library is used in nearly all modules of the system. The system was developed and used in the European project RACE.
