Paper Information - Bringing Vision to the Blind: From Coarse to Fine, One Dollar at a Time

While deep learning has achieved great success in building vision applications for mainstream users, relatively little work has addressed personal, on-device visual assistants that the blind and visually impaired can use in daily life. Unlike mainstream applications, a vision system for the blind must be robust, reliable, and safe to use. In this paper, we propose a fine-grained currency recognizer based on CONGAS, which surpasses other popular local features by a large margin. In addition, we introduce an effective and lightweight coarse classifier that gates the fine-grained recognizer on resource-constrained mobile devices. The coarse-to-fine approach provides an extensible mobile-vision architecture and demonstrates how coordinating deep learning with local-feature-based methods can help resolve a challenging problem for the blind and visually impaired. The proposed system runs in real time with ~150 ms latency on a Pixel device and achieves 98% precision and 97% recall on a challenging evaluation set.
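
The gating idea in the abstract can be illustrated with a minimal sketch: a cheap coarse classifier scores every frame, and the costlier fine-grained recognizer runs only when that score clears a threshold. Everything named here (`Recognition`, `coarse_to_fine`, `gate_threshold`, and the callable signatures) is a hypothetical stand-in chosen for illustration. It is not the paper's API, and it does not implement CONGAS or the paper's coarse network.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Recognition:
    """Hypothetical result type: a denomination label and a match score."""
    label: str
    score: float


def coarse_to_fine(
    frame: Sequence[float],
    coarse_score: Callable[[Sequence[float]], float],
    fine_recognize: Callable[[Sequence[float]], Optional[Recognition]],
    gate_threshold: float = 0.5,  # assumed value, not taken from the paper
) -> Optional[Recognition]:
    """Run the expensive recognizer only when the cheap gate fires."""
    # Stage 1: lightweight check for whether the frame is worth examining.
    if coarse_score(frame) < gate_threshold:
        return None  # skip the fine-grained stage entirely
    # Stage 2: fine-grained recognition on frames that pass the gate.
    return fine_recognize(frame)
```

In a sketch like this, most frames without a banknote would exit at stage 1, which is one way a gated design could keep average on-device cost low. The threshold trades missed detections against wasted fine-grained calls and would need tuning on real data.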
