LipPass: Lip Reading-based User Authentication on Smartphones Leveraging Acoustic Signals

To protect users' privacy, more and more mobile devices employ biometric authentication approaches, such as fingerprint, face recognition, and voiceprint authentication. However, these approaches are vulnerable to replay attacks. Although state-of-the-art solutions use liveness verification to combat such attacks, existing approaches are sensitive to the ambient environment, such as ambient light and surrounding audible noise. To this end, we explore liveness verification for user authentication based on users' lip movements, which can be sensed robustly in noisy environments. In this paper, we propose a lip reading-based user authentication system, LipPass, which uses the built-in audio devices on smartphones to extract unique behavioral characteristics of users' speaking lips for user authentication. We first investigate the Doppler profiles of acoustic signals caused by users' speaking lips, and find that different individuals exhibit unique lip movement patterns. To characterize the lip movements, we propose a deep learning-based method to extract efficient features from the Doppler profiles, and employ Support Vector Machine and Support Vector Domain Description to construct binary classifiers and spoofer detectors for user identification and spoofer detection, respectively. We then develop a binary tree-based authentication approach that uses these binary classifiers and spoofer detectors to accurately identify each individual among the registered users. In extensive experiments involving 48 volunteers in four real environments, LipPass achieves 90.21% accuracy in user identification and 93.1% accuracy in spoofer detection.
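
To give a sense of the scale of the Doppler shifts involved, the following minimal sketch applies the standard reflection approximation Δf ≈ 2·v·f0/c for a reflector moving at velocity v, which holds when v is much smaller than the speed of sound c. The 20 kHz carrier, the 343 m/s speed of sound, the sample lip velocities, and the function names are illustrative assumptions and are not taken from the abstract.

```python
# Illustrative sketch only: carrier frequency, speed of sound, and lip
# velocities below are assumptions, not values reported for LipPass.

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees Celsius


def doppler_shift_hz(carrier_hz, velocity_m_s, speed_of_sound=SPEED_OF_SOUND_M_S):
    """Approximate Doppler shift of a tone reflected off a moving surface.

    Positive velocity means the surface moves toward the phone. The factor
    of 2 accounts for the round trip (speaker -> lips -> microphone).
    Valid when the velocity is much smaller than the speed of sound.
    """
    return 2.0 * velocity_m_s * carrier_hz / speed_of_sound


def velocity_from_shift(carrier_hz, shift_hz, speed_of_sound=SPEED_OF_SOUND_M_S):
    """Invert the approximation: recover surface velocity from a measured shift."""
    return shift_hz * speed_of_sound / (2.0 * carrier_hz)


if __name__ == "__main__":
    carrier = 20_000.0  # assumed near-inaudible carrier, in Hz
    for v in (-0.10, 0.02, 0.10):  # hypothetical lip velocities, in m/s
        print(f"v = {v:+.2f} m/s -> shift = {doppler_shift_hz(carrier, v):+.2f} Hz")
```

Under these assumptions the shifts are only a few hertz to a few tens of hertz around the carrier, which suggests why a fine-grained frequency analysis of the received signal would be needed to capture lip movements.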
