Robust lip detection based on histogram of oriented gradient features and convolutional neural network under effects of light and background

Abstract. Detecting the lip area is an essential pre-processing step for several applications, such as lip reading and visual information services. In this paper, we propose a lip detection method that locates the lip area in an image using histogram of oriented gradient (HOG) features and a convolutional neural network (CNN). We first find the face area in the input image, divide the face image in half, and apply sliding window detection to the bottom half. For each window, we compute the HOG feature vector of the corresponding image patch and use it as input to a pre-trained support vector machine (SVM); HOG and the SVM together perform coarse detection. If the SVM determines that the patch is not the lip, we continue the sliding window search. Otherwise, the patch is passed to the CNN, which performs fine detection and decides whether the patch is the lip. If the CNN confirms that the patch is the lip, we apply Canny edge detection to it to obtain the mouth contour. Experiments in MATLAB confirm the effectiveness of the method, which finds the mouth area with over 94% accuracy and over 98% precision.
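
The coarse-to-fine control flow described in the abstract can be sketched as follows. This is only an illustrative sketch in Python, not the authors' MATLAB implementation: the function names (`crop`, `sliding_windows`, `detect_lip`), the window size, the stride, and the two classifier callables are all assumptions. The `coarse` callable stands in for the HOG + SVM stage and `fine` stands in for the CNN stage; here both are toy intensity tests so that the example runs without any trained model. Face detection and the final Canny edge step are omitted.

```python
from typing import Callable, Iterator, List, Optional, Sequence, Tuple

Image = Sequence[Sequence[float]]   # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]     # (top, left, height, width)
Classifier = Callable[[List[List[float]]], bool]


def crop(image: Image, box: Box) -> List[List[float]]:
    """Return the patch of `image` covered by `box`."""
    top, left, h, w = box
    return [list(row[left:left + w]) for row in image[top:top + h]]


def sliding_windows(height: int, width: int, win_h: int, win_w: int,
                    stride: int, top_offset: int = 0) -> Iterator[Box]:
    """Yield window boxes row by row, starting at row `top_offset`."""
    for top in range(top_offset, height - win_h + 1, stride):
        for left in range(0, width - win_w + 1, stride):
            yield (top, left, win_h, win_w)


def detect_lip(face: Image, coarse: Classifier, fine: Classifier,
               win_h: int = 2, win_w: int = 4, stride: int = 1) -> Optional[Box]:
    """Search the bottom half of a face image with a two-stage cascade.

    `coarse` is a placeholder for the HOG + SVM stage and `fine` for the
    CNN stage; both are hypothetical callables supplied by the caller.
    """
    height = len(face)
    width = len(face[0]) if height else 0
    # Only windows that start in the bottom half of the face are examined.
    for box in sliding_windows(height, width, win_h, win_w, stride,
                               top_offset=height // 2):
        patch = crop(face, box)
        if not coarse(patch):   # coarse stage rejects: slide to the next window
            continue
        if fine(patch):         # fine stage confirms: report this window
            return box
    return None


def mean_above_half(patch: List[List[float]]) -> bool:
    """Toy stand-in for the coarse stage (assumption, not HOG + SVM)."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values) > 0.5


def all_bright(patch: List[List[float]]) -> bool:
    """Toy stand-in for the fine stage (assumption, not a CNN)."""
    return all(v > 0.9 for row in patch for v in row)


# Synthetic 8x8 "face" with a bright 2x4 region in the bottom half.
face = [[0.0] * 8 for _ in range(8)]
for r in (5, 6):
    for c in range(2, 6):
        face[r][c] = 1.0

result = detect_lip(face, mean_above_half, all_bright)
```

The point of the cascade is that the cheap coarse test discards most windows, so the more expensive fine test runs only on the few candidates that survive.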
