Toward an assisted indoor scene perception for blind people with image multilabeling strategies

A novel coarse indoor scene description framework for blind people is introduced. Three image multilabeling implementation strategies are proposed. Experimental validation was conducted on different public indoor environments. Results qualify the approach for a near real-time blind assistance technology.

In this work, we present novel strategies to coarsely describe indoor scenes by listing the objects surrounding a blind person equipped with a portable digital camera. They rely on a new multilabeling approach that computes the similarity between a query image and a set of multilabeled images stored in a library in order to select the most similar images. Since each library image carries its own list of objects, the co-occurrence of objects among the most similar images is exploited to "multilabel" the query image. The multilabeling approach is implemented by means of three different strategies, based respectively on the scale-invariant feature transform (SIFT), the notion of bag of words, and principal component analysis (PCA). The proposed methods were tested on datasets corresponding to two different public indoor sites. The results are promising and suggest that a near real-time implementation can be envisioned for describing public indoor environments with many predefined objects and with good accuracy.
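The retrieve-then-vote idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors stand in for whatever representation a given strategy produces (SIFT, bag of words, or PCA), and the cosine similarity measure, the number of retrieved images `k`, and the vote threshold `min_votes` are assumptions chosen for illustration only.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def multilabel(query, library, k=3, min_votes=2):
    """Label a query from the objects shared by its k most similar library images.

    library: list of (feature_vector, object_labels) pairs.
    k and min_votes are hypothetical parameters, not values from the paper.
    """
    # Rank library images by similarity to the query and keep the top k.
    ranked = sorted(
        library,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,
    )
    top = ranked[:k]
    # Count how many of the retrieved images contain each object.
    votes = Counter(label for _, labels in top for label in set(labels))
    # Keep objects that co-occur in at least min_votes retrieved images.
    return sorted(label for label, count in votes.items() if count >= min_votes)


# Toy library with made-up feature vectors and object lists.
library = [
    ([1.0, 0.0, 0.2], ["door", "chair"]),
    ([0.9, 0.1, 0.3], ["door", "table"]),
    ([0.0, 1.0, 0.0], ["stairs"]),
    ([0.8, 0.2, 0.1], ["door", "chair", "bin"]),
]
query = [1.0, 0.05, 0.2]

result = multilabel(query, library, k=3, min_votes=2)
print(result)  # ['chair', 'door']
```

In this toy example the three retrieved images all contain a door and two contain a chair, so both labels pass the threshold, while "table" and "bin" appear only once and are dropped.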
