A Vision Aid for the Visually Impaired using Commodity Dual-Rear-Camera Smartphones

Dual (or multiple) rear cameras on hand-held smartphones are widely regarded as the future of mobile photography. Many such models have recently been released, mainly with dual rear cameras: one wide-angle and one telephoto. Notable examples include the Apple iPhone 7 Plus and 8 Plus, iPhone X, Samsung Galaxy S9, LG V30, and Huawei Mate 10. With built-in dual-camera systems, these devices can not only produce better-quality pictures but also acquire 3D stereo photos with depth information, capturing moments in life with depth much like our own two-eye vision. Thanks to this trend, these phones are becoming cheaper while growing more capable. In this paper, we describe a system that uses commodity dual-rear-camera phones, such as the iPhone X, to provide aids for people who are visually impaired. We propose placing the phone at the centre of the user's chest, with one or two Bluetooth earphones worn to listen to the phone's audio output. Our system consists of three modules: (1) scene-context recognition to audio, (2) 3D stereo reconstruction to audio, and (3) interactive audio/voice controls. In slightly more detail, the wide-angle camera captures live photos that a GPS-guided deep-learning process analyses to describe the scene in front of the user (module 1). The telephoto camera captures a narrower view, which is stereo-reconstructed together with the wide-angle image to form a depth map (a dense area-based distance map). The map gives the distance to all visible objects, so that the user can be notified of critical ones (module 2). This module also makes the phone vibrate when an object is close enough to the user, e.g. within hand-reach distance. The user can also query the system with various questions and receive automatic spoken answers (module 3).
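The stereo reconstruction in module 2 ultimately rests on the standard pinhole relation between disparity and depth, Z = f·B/d. A minimal sketch follows; the focal length and the ~1 cm camera baseline used in the example are illustrative assumptions, not calibrated values for any particular phone:

```python
def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Standard pinhole stereo relation: depth Z = f * B / d.

    disparity_px -- pixel shift of a point between the two rectified views
    focal_px     -- focal length expressed in pixels
    baseline_m   -- distance between the two camera centres, in metres
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a visible point")
    return focal_px * baseline_m / disparity_px

# Illustrative numbers only (assumed, not measured from a real device):
# a 2800 px focal length and a 1 cm baseline give 0.5 m for a 56 px disparity.
print(depth_from_disparity(56.0, 2800.0, 0.01))  # -> 0.5
```

Note the small baseline of phone dual cameras limits depth resolution at long range, which is why the system focuses on nearby, hand-reach-scale obstacles.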
In addition, a manual rescue module (module 4) is provided for when other things go wrong. An example of the vision-to-audio output could be "Overall, likely a corridor, one medium object is 0.5 m away - central left", or "Overall, city pathway, front cleared". An audio command input may be "read texts", upon which the phone detects and reads all text on the closest object. More details on the design and implementation are described in this paper.
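The proximity-alert behaviour of module 2 can be sketched as a pass over the dense depth map: find the nearest valid distance, derive a coarse left/central/right direction from its column, and trigger vibration below a hand-reach threshold. This is a hypothetical sketch, not the paper's implementation; the 0.6 m threshold and the `proximity_alert` helper are assumptions for illustration.

```python
import numpy as np

HAND_REACH_M = 0.6  # assumed "within hand reach" threshold (not specified above)

def proximity_alert(depth_map: np.ndarray, min_valid_px: int = 50):
    """Given a dense depth map in metres (H x W), return the nearest obstacle
    distance, a coarse horizontal region, and whether the phone should vibrate."""
    valid = depth_map > 0  # mask out failed stereo matches (zero/negative depth)
    if valid.sum() < min_valid_px:
        return None  # too little reliable depth to report anything
    flat = np.where(valid.ravel(), depth_map.ravel(), np.inf)
    row, col = np.unravel_index(flat.argmin(), depth_map.shape)
    nearest = float(depth_map[row, col])
    frac = col / depth_map.shape[1]  # position of nearest pixel across the image
    region = "left" if frac < 1 / 3 else "central" if frac < 2 / 3 else "right"
    return {"distance_m": round(nearest, 2),
            "region": region,
            "vibrate": nearest <= HAND_REACH_M}

# Example: synthetic 60x90 depth map, background 4 m, one object 0.5 m away.
depth = np.full((60, 90), 4.0)
depth[20:40, 25:35] = 0.5
print(proximity_alert(depth))  # -> {'distance_m': 0.5, 'region': 'left', 'vibrate': True}
```

A message such as "one medium object is 0.5 m away - central left" would then be synthesised from this dictionary by the audio-output layer.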
