Unsupervised semantic indoor scene classification for robot vision based on context of features using Gist and HSV-SIFT

This paper presents an unsupervised scene classification method for achieving semantic recognition of indoor scenes. Background and foreground features are extracted using Gist and color scale-invariant feature transform (SIFT), respectively, as context-based feature representations. We used hue, saturation, and value SIFT (HSV-SIFT) because its algorithm is simple and has a low calculation cost. Our method builds bags of features by voting the visual words created from both feature descriptors into a two-dimensional histogram. Moreover, our method generates labels as category candidates for time-series images while maintaining both stability and plasticity. Category maps are labeled automatically by using the labels created with adaptive resonance theory (ART) as teaching signals for counter propagation networks (CPNs). We evaluated our method for semantic scene classification using KTH’s image database for robot localization (KTH-IDOL), which is widely used for robot localization and navigation. The mean classification accuracies of Gist, gray SIFT, one-class support vector machines (OC-SVM), position-invariant robust features (PIRF), and our method are 39.7, 58.0, 56.0, 63.6, and 79.4 %, respectively. The accuracy of our method is 15.8 percentage points higher than that of PIRF. Moreover, we applied our method to fine classification using our original mobile robot. We obtained a mean classification accuracy of 83.2 % for six zones.
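The two-dimensional voting step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how Gist and HSV-SIFT words are paired, so the sketch assumes that each sampled region supplies one descriptor of each kind and that the two codebooks already exist (for example, from k-means). The names `nearest_word` and `joint_word_histogram` are hypothetical.

```python
import numpy as np


def nearest_word(descriptors, codebook):
    """Return the index of the closest codeword for every descriptor."""
    # Squared Euclidean distance between each descriptor and each codeword.
    diff = descriptors[:, None, :] - codebook[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)


def joint_word_histogram(gist_desc, sift_desc, gist_codebook, sift_codebook):
    """Vote paired visual words into a normalized two-dimensional histogram.

    Assumption (not from the paper): row i of gist_desc and row i of
    sift_desc describe the same region, so they form one vote.
    """
    if len(gist_desc) != len(sift_desc):
        raise ValueError("descriptor sets must be paired row by row")
    gist_words = nearest_word(gist_desc, gist_codebook)
    sift_words = nearest_word(sift_desc, sift_codebook)
    hist = np.zeros((len(gist_codebook), len(sift_codebook)))
    # np.add.at accumulates repeated (row, column) pairs correctly.
    np.add.at(hist, (gist_words, sift_words), 1.0)
    total = hist.sum()
    return hist / total if total > 0 else hist


# Toy data: 2 Gist codewords, 3 SIFT codewords, 3 paired regions.
gist_codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
sift_codebook = np.array([[0.0], [5.0], [10.0]])
gist_desc = np.array([[1.0, 1.0], [9.0, 9.0], [9.0, 8.0]])
sift_desc = np.array([[0.2], [9.0], [4.9]])
hist = joint_word_histogram(gist_desc, sift_desc, gist_codebook, sift_codebook)
```

The flattened histogram could then serve as the input vector for a downstream unsupervised learner; the codebook sizes here are toy values chosen only for readability.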
