DWT-based phonetic groups classification using neural networks

This paper improves the discrete wavelet transform (DWT)-based phonetic classification algorithm by using neural networks (NNs) to learn optimal thresholds for speech classification. Two two-layer feedforward NNs operate on input features extracted from 10 ms speech frames by DWT and statistical measurements, and classify each frame into one of four categories: transient, voiced vowel, voiced consonant, or unvoiced consonant. Hard thresholds from our earlier paper are used to detect silence and voiced closure intervals. The new algorithm is tested on the TIMIT database and compared with other algorithms to demonstrate its superior performance.
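The frame-level pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the three decomposition levels, the log subband power features, the hidden-layer size, and the random untrained weights are all assumptions made for illustration. A single network with a four-way output stands in for the paper's two networks, whose arrangement the abstract does not specify. The threshold-based silence and voiced-closure detection is omitted.

```python
import math
import random

SAMPLE_RATE = 16000               # TIMIT audio is sampled at 16 kHz
FRAME_LEN = SAMPLE_RATE // 100    # 10 ms frame -> 160 samples
CLASSES = ["transient", "voiced vowel", "voiced consonant", "unvoiced consonant"]


def haar_dwt(x, levels):
    """Haar DWT (assumed wavelet): returns [detail_1, ..., detail_levels, approx]."""
    bands = []
    approx = list(x)
    for _ in range(levels):
        if len(approx) % 2:               # pad odd lengths by repeating the last sample
            approx.append(approx[-1])
        pairs = list(zip(approx[0::2], approx[1::2]))
        bands.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    bands.append(approx)
    return bands


def frame_features(frame, levels=3):
    """Log mean power per subband (hypothetical stand-in for the paper's features)."""
    return [math.log(sum(c * c for c in band) / len(band) + 1e-12)
            for band in haar_dwt(frame, levels)]


def init_layer(n_in, n_out, rng):
    """Random weights; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(x, hidden, output):
    """Two-layer feedforward pass: tanh hidden layer, softmax output layer."""
    h = [math.tanh(sum(w * v for w, v in zip(row, x + [1.0]))) for row in hidden]
    z = [sum(w * v for w, v in zip(row, h + [1.0])) for row in output]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def classify_frames(signal, hidden, output):
    """Split the signal into non-overlapping 10 ms frames and label each one."""
    labels = []
    for start in range(0, len(signal) - FRAME_LEN + 1, FRAME_LEN):
        frame = signal[start:start + FRAME_LEN]
        probs = forward(frame_features(frame), hidden, output)
        labels.append(CLASSES[probs.index(max(probs))])
    return labels


rng = random.Random(0)
hidden = init_layer(4, 8, rng)            # 3 detail bands + 1 approximation band
output = init_layer(8, len(CLASSES), rng)
# 100 ms synthetic 200 Hz tone used as placeholder input, not real speech
signal = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE // 10)]
labels = classify_frames(signal, hidden, output)
```

Because the weights are untrained, the labels carry no meaning. In practice the network would be trained on labeled frames, for example from TIMIT phonetic transcriptions, so that it learns the decision boundaries that fixed thresholds would otherwise define.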

Tuan Van Pham | Gernot Kubin
