Face Recognition in Video Streams for Mobile Assistive Devices Dedicated to Visually Impaired

In this paper, we introduce a novel face detection and recognition system based on deep convolutional networks, designed to improve visually impaired users' interaction and communication in social encounters. The first component of the proposed architecture is a face detection system able to identify the various persons present in the scene regardless of the subject's location or pose. The detected faces are then tracked between successive frames using a CNN-based (convolutional neural network) tracker trained offline with generic motion patterns. The system can handle face occlusion, rotation, and pose variation, as well as significant illumination changes. Finally, the faces are recognized in real time, directly from the video stream. The major contribution of the paper is a novel weight adaptation scheme that determines the relevance of each face instance and creates a global, fixed-size representation from all face instances tracked during the video stream. The experimental evaluation, performed on a set of 30 videos, validates the approach, with average detection and recognition scores above 85%.
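The abstract does not specify how the weight adaptation scheme computes relevance, so the following is only a minimal sketch of the general idea of fusing a variable number of tracked face instances into one fixed-size descriptor. It assumes that each tracked face already has an embedding vector and a scalar relevance score (both hypothetical inputs), and it uses a softmax-weighted average followed by L2 normalization as a stand-in for the paper's actual scheme. The function name `aggregate_track` is illustrative and does not come from the paper.

```python
import math
from typing import List, Sequence


def aggregate_track(embeddings: Sequence[Sequence[float]],
                    relevance: Sequence[float]) -> List[float]:
    """Fuse per-frame face embeddings into one fixed-size vector.

    Assumption: `relevance` holds one score per face instance (higher
    means more useful, e.g. sharper or more frontal). The softmax
    weighting below is a stand-in, not the scheme used in the paper.
    """
    if not embeddings or len(embeddings) != len(relevance):
        raise ValueError("need one relevance score per embedding")
    dim = len(embeddings[0])

    # Softmax over relevance scores; subtracting the maximum keeps
    # the exponentials numerically stable.
    top = max(relevance)
    exps = [math.exp(r - top) for r in relevance]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum: the output size depends only on the embedding
    # dimension, not on how many frames were tracked.
    fused = [0.0] * dim
    for w, emb in zip(weights, embeddings):
        for i in range(dim):
            fused[i] += w * emb[i]

    # L2-normalize so descriptors from different tracks are comparable.
    norm = math.sqrt(sum(x * x for x in fused))
    return [x / norm for x in fused] if norm > 0 else fused


# Example: three tracked instances of one face, with 4-D embeddings.
track = [[0.9, 0.1, 0.0, 0.0],
         [0.8, 0.2, 0.0, 0.0],
         [0.1, 0.1, 0.9, 0.0]]   # e.g. a blurred or occluded frame
scores = [2.0, 1.5, -3.0]        # low relevance for the last frame
descriptor = aggregate_track(track, scores)
```

Whatever the track length, the resulting descriptor has the same dimension as a single embedding, so it can be compared against enrolled identities with a fixed-cost distance computation.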
