Ultrasonic sensing for robust speech recognition

In this paper, we present our work using ultrasonic sensing of speech for digit recognition. First, a set of spectral ultrasonic features is developed and tuned to optimize performance on the digit recognition task. Using these features, we demonstrate an overall accuracy of 33.00% on a digit recognition task using HMMs with recordings from 6 speakers. The results indicate that ultrasonic sensing of speech is viable, but that further work is needed to achieve word accuracies that match those of audio. Finally, experimental results are presented which demonstrate that fusing information from ultrasound and audio sources yields marginal improvements over audio-only performance.
