The Recognition of Bimodal Produced Speech based on Multi-style Training

This paper presents an analysis of word recognition on the Whi-Spe speech database, which contains utterances in both normal and whispered phonation, using the conventional HMM/GMM framework. The analysis, based on multi-style training, is performed in both speaker-dependent (SD) and speaker-independent (SI) modes. It shows that only a small portion of whispered data in the training set (10%) is required to achieve whisper recognition accuracy above 90%, for both SD and SI recognition.
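The core of multi-style training here is simply building a training set that mixes a controlled fraction of whispered utterances into the normal-speech data before HMM/GMM training. A minimal sketch of that mixing step, under assumptions (the function name, utterance-ID lists, and the size-preserving replacement policy are all hypothetical, not taken from the paper):

```python
import random

def build_multistyle_train_set(normal_utts, whisper_utts,
                               whisper_fraction=0.10, seed=0):
    """Mix a fraction of whispered utterances into a normal-speech
    training set (hypothetical helper; the paper's exact mixing
    policy is not specified here).

    The total set size stays constant: sampled whispered items
    replace an equal number of normal items.
    """
    rng = random.Random(seed)
    n_total = len(normal_utts)
    n_whisper = round(whisper_fraction * n_total)
    whisper_part = rng.sample(whisper_utts, n_whisper)
    normal_part = rng.sample(normal_utts, n_total - n_whisper)
    return normal_part + whisper_part

# Example with dummy utterance identifiers
normal = [f"n{i}" for i in range(100)]
whisper = [f"w{i}" for i in range(100)]
train = build_multistyle_train_set(normal, whisper, whisper_fraction=0.10)
```

The resulting list would then feed the usual acoustic-model training pipeline; only the composition of the training data changes, not the HMM/GMM recipe itself.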
