Mobile Device-based Speech Enhancement System Using Lip-reading

This paper describes a preliminary study toward a new type of speech enhancement system. To avoid the conspicuous appearance of an electrolarynx, we use lip-reading instead. Our ultimate goal is a smartphone with a camera and audio output that converts lip motion into speech. We tested three image recognition methods: a multilayer perceptron (MLP), a convolutional neural network (CNN), and MobileNets. Datasets of 3k images for training and testing were recorded from five people. The preliminary experiment indicated that MobileNets is the most suitable algorithm for smartphone apps in terms of recognition accuracy and computational cost.
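The computational-cost advantage attributed to MobileNets can be illustrated with a rough operation count. The sketch below is not the implementation used in this study; it only compares the multiply-add count of a standard convolution with that of the depthwise separable convolution on which MobileNets is built, using the cost formulas given in the MobileNets paper. The function names and the layer dimensions (kernel size, channel counts, feature-map size) are illustrative assumptions and are not taken from our experiments.

```python
def standard_conv_cost(k, m, n, f):
    """Multiply-adds of a standard k x k convolution.

    m: input channels, n: output channels, f: output feature-map side length.
    """
    return k * k * m * n * f * f


def depthwise_separable_cost(k, m, n, f):
    """Multiply-adds of a depthwise separable convolution (MobileNets building block)."""
    depthwise = k * k * m * f * f   # one k x k filter per input channel
    pointwise = m * n * f * f       # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer dimensions, chosen only for illustration.
k, m, n, f = 3, 64, 128, 56
std = standard_conv_cost(k, m, n, f)
sep = depthwise_separable_cost(k, m, n, f)
print(f"standard:            {std} multiply-adds")
print(f"depthwise separable: {sep} multiply-adds")
print(f"cost ratio:          {sep / std:.3f}")
```

For any layer the ratio equals 1/n + 1/k^2, so with 3 x 3 kernels the separable form needs roughly one eighth to one ninth of the multiply-adds, which is consistent with the lower calculation cost reported above for a smartphone setting.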
