An Indoor Room Classification System for Social Robots via Integration of CNN and ECOC

The ability to classify rooms in a home is one of many attributes desired for social robots. In this paper, we address the problem of indoor room classification using several convolutional neural network (CNN) architectures, namely VGG16, VGG19, and Inception V3. The main objective is to recognize five indoor classes (bathroom, bedroom, dining room, kitchen, and living room) from the Places dataset. We considered 11,600 images per class and subsequently fine-tuned the networks. The simulation studies suggest that cleaning the disparate data produced considerably better results for all of the examined CNN architectures. We report that the fine-tuned VGG16 and VGG19 models, trained on all layers, produced the best validation accuracy, with 93.29% and 93.61% on clean data, respectively. We also propose and examine a model that combines a CNN with a multi-binary classifier, referred to as error-correcting output codes (ECOC), using the clean data. Across the 15 binary classifiers, the highest validation accuracy was 98.5% and the average was 95.37%. CNN, CNN-ECOC, and an alternative form called CNN-ECOC Regression were evaluated in a real-time implementation on a NAO humanoid robot. The results show the superiority of the combined CNN and ECOC model over the conventional CNN. The implications and challenges of the real-time experiments are also discussed in the paper.
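
To illustrate how an ECOC layer can turn 15 binary decisions into one of five room labels, the following is a minimal sketch and not the paper's implementation. It assumes an exhaustive codebook, which for k classes has 2^(k-1) - 1 columns (15 when k = 5, consistent with the classifier count above), and it assumes Hamming-distance decoding; the abstract does not state which codebook or decoder was used. The function names `exhaustive_codebook` and `decode` are hypothetical, and the binary outputs are hard-coded here where a real system would take them from the CNN-based classifiers.

```python
from itertools import product

CLASSES = ["bathroom", "bedroom", "dining room", "kitchen", "living room"]


def exhaustive_codebook(k):
    """Build one k-bit column per binary classifier (assumed exhaustive design).

    Each column is a split of the k classes into a positive and a negative
    group. Class 0 is always positive so that a split and its complement are
    not both generated, and the all-ones column (no split) is skipped.
    """
    columns = []
    for rest in product([1, 0], repeat=k - 1):
        column = (1,) + rest
        if all(column):
            continue
        columns.append(column)
    return columns


def decode(bits, columns):
    """Return the index of the class whose codeword is nearest in Hamming distance."""
    k = len(columns[0])
    distances = [
        sum(bit != column[c] for bit, column in zip(bits, columns))
        for c in range(k)
    ]
    return min(range(k), key=distances.__getitem__)


columns = exhaustive_codebook(len(CLASSES))

# Simulate the 15 binary outputs for a kitchen image, then flip three of
# them to mimic three binary classifiers answering incorrectly.
kitchen = CLASSES.index("kitchen")
noisy = [column[kitchen] for column in columns]
for wrong in (0, 7, 12):
    noisy[wrong] ^= 1

predicted = CLASSES[decode(noisy, columns)]
print(len(columns), predicted)
```

Under this assumed codebook every pair of class codewords differs in 8 of the 15 positions, so up to three wrong binary decisions still decode to the correct room. This redundancy is the usual motivation for combining a multiclass model with ECOC.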
