Model-Based 3D Object Recognition in RGB-D Images

A computational framework for 3D object recognition in RGB-D images is presented. The focus is on computer vision applications in indoor autonomous robotics, where either individual objects must be recognized so that the robot can grasp and manipulate them, or the entire scene must be recognized to support high-level cognitive tasks. The framework integrates solutions for generic (i.e. type-based) object representation (e.g. semantic networks), trainable transformations between abstraction levels (e.g. neural networks), reasoning under uncertain and partial data (e.g. Dynamic Bayesian Networks, fuzzy logic), optimized model-to-data matching (e.g. formulated as constraint optimization problems) and efficient search strategies (switching between data-driven and model-driven inference steps). The computational implementation of the object model and of the object recognition strategy is presented in more detail. The test scenarios cover the recognition of cups and bottles, or of household furniture. The experiments and the chosen applications confirm that the approach is valid and can easily be adapted to multiple scenarios.
