Extracting Semantic Information from Visual Data: A Survey

Traditional environment maps built by mobile robots include both metric and topological maps. These maps are navigation-oriented and are inadequate for service robots that must interact with or serve human users, who normally rely on conceptual knowledge or the semantic content of the environment. Constructing semantic maps is therefore necessary for building an effective human-robot interface for service robots. This paper reviews recent research and development in vision-based semantic mapping. The main focus is on how semantic information is extracted from visual data, in terms of feature extraction, object/place recognition, and semantic representation methods.
