End-to-End Indoor Navigation Assistance for the Visually Impaired Using Monocular Camera

This work proposes a novel approach to indoor navigation assistance for visually impaired people that relies solely on a monocular camera. We cast the task as an image classification problem and tackle it holistically, in an end-to-end fashion, using state-of-the-art deep convolutional residual networks. Given an input RGB image of an indoor scene, our model guides a visually impaired person around the obstacles in the scene using four discrete navigational directions. The model achieves robust results, with higher classification accuracy and a lower false-alarm rate. Compared against two baseline approaches, it improves the F1 score by more than 25%.
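To make the evaluation metric concrete, the following is a minimal sketch of how a per-class and macro-averaged F1 score could be computed for a four-way direction classifier. The direction names, the function names, and the choice of macro averaging are assumptions made for illustration; the abstract states only that there are four discrete directions and that the F1 measure is reported.

```python
from collections import Counter

# Hypothetical label set: the abstract only says "four discrete
# navigational directions" and does not name them.
DIRECTIONS = ["forward", "left", "right", "stop"]


def per_class_f1(y_true, y_pred, labels=DIRECTIONS):
    """Return a dict mapping each label to its F1 score.

    Uses F1 = 2*TP / (2*TP + FP + FN); a class that never appears
    in either list gets a score of 0.0.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class raised a false alarm
            fn[truth] += 1  # true class was missed
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels=DIRECTIONS):
    """Unweighted mean of the per-class F1 scores (an assumed averaging choice)."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(scores)
```

Calling `macro_f1(true_labels, predicted_labels)` on two equal-length lists of direction strings yields a single score in [0, 1]; false positives counted per predicted class correspond to the false alarms mentioned above.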
