Non-Contact Speech Recovery Technology Using a 24 GHz Portable Auditory Radar and Webcam

Language is one of the most effective means of human communication and information exchange. To address the problem of robust, non-contact speech recognition, recovery, and surveillance, this paper presents a speech recovery technology based on a 24 GHz portable auditory radar and a webcam. The continuous-wave auditory radar extracts the vocal vibration signal, and the webcam provides the fitted formant frequencies. A traditional formant speech synthesizer then synthesizes and recovers the speech, using the vocal vibration signal as the sound source excitation and the fitted formant frequencies as the vocal tract resonance characteristics. Experiments on reading single English characters and words are carried out. With microphone recordings as a reference, the effectiveness of the proposed speech recovery technology is verified. Mean opinion scores show relatively high consistency between the synthesized speech and the original acoustic speech.
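The source-filter idea behind a formant synthesizer can be illustrated with a minimal sketch. This is not the implementation used in the paper: it assumes a cascade of second-order digital resonators (a common textbook form), stands in an impulse train for the radar-derived vocal vibration signal, and uses illustrative formant frequencies and bandwidths for a vowel close to /a/ in place of the webcam-fitted values. All function names, the sample rate, and the numeric parameters are hypothetical.

```python
import math


def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Second-order digital resonator with unity gain at DC.

    Difference equation: y[n] = a*x[n] + b*y[n-1] + c*y[n-2].
    """
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    out = []
    y1 = y2 = 0.0  # previous two output samples
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def impulse_train(f0_hz, duration_s, sample_rate):
    """Placeholder excitation: one unit impulse per pitch period.

    Assumption: in the paper this role is played by the vocal vibration
    signal extracted by the radar, not by a synthetic impulse train.
    """
    n = int(duration_s * sample_rate)
    period = int(round(sample_rate / f0_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def synthesize(excitation, formants, sample_rate):
    """Pass the excitation through one resonator per (frequency, bandwidth) pair."""
    out = excitation
    for freq_hz, bandwidth_hz in formants:
        out = resonator(out, freq_hz, bandwidth_hz, sample_rate)
    return out


SAMPLE_RATE = 8000  # Hz, assumed for illustration
excitation = impulse_train(120.0, 0.25, SAMPLE_RATE)
# Illustrative (frequency, bandwidth) pairs in Hz; the paper fits the
# formant frequencies from webcam data instead.
formants = [(730.0, 90.0), (1090.0, 110.0), (2440.0, 170.0)]
speech = synthesize(excitation, formants, SAMPLE_RATE)
```

In this sketch the excitation carries the pitch and timing, while the resonators shape the spectrum, which mirrors the division of roles between the radar signal and the fitted formant frequencies described above.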
