The Research of Endpoint Detection and Initial/Final Segmentation for Chinese Whispered Speech

This paper presents feature parameters suited to endpoint detection and initial/final (I/F) segmentation of whispered speech, based on waveform characteristics. First, the whispered speech is decomposed into two parts, approximation and detail, by the discrete wavelet transform (DWT). The fractal dimension of the approximation part is then calculated as the feature parameter for speech/non-speech segmentation, and the energy ratio of the approximation and detail parts (DAER) is calculated as the parameter for initial/final segmentation. Finally, an efficient detection algorithm is given that performs both endpoint detection and I/F segmentation. Experiments show that the algorithm is simple and efficient, and is a suitable method for endpoint detection and I/F segmentation of whispered speech.
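
The two per-frame features described above can be illustrated with a minimal sketch. The abstract does not state which wavelet, which fractal dimension estimator, or which direction of the energy ratio the paper uses, so everything here is an assumption chosen for illustration: a one-level Haar DWT, the Katz fractal dimension estimator, DAER taken as detail energy divided by approximation energy, and synthetic 256-sample frames with amplitudes in [-1, 1]. The function names are hypothetical.

```python
import math
import random

def haar_dwt(frame):
    """One-level Haar DWT of a frame: returns (approximation, detail).
    The Haar wavelet is an assumption; the paper's wavelet is not stated here."""
    n = len(frame) - len(frame) % 2  # ignore a trailing odd sample
    scale = math.sqrt(2.0)
    approx = [(frame[i] + frame[i + 1]) / scale for i in range(0, n, 2)]
    detail = [(frame[i] - frame[i + 1]) / scale for i in range(0, n, 2)]
    return approx, detail

def katz_fd(x):
    """Katz fractal dimension of a sequence sampled at unit spacing.
    One possible estimator (an assumption); expects amplitude-normalized input."""
    steps = len(x) - 1
    if steps < 2:
        return 1.0
    # Total curve length and largest distance from the first point.
    length = sum(math.hypot(1.0, x[i + 1] - x[i]) for i in range(steps))
    extent = max(math.hypot(i, x[i] - x[0]) for i in range(1, steps + 1))
    return math.log10(steps) / (math.log10(steps) + math.log10(extent / length))

def energy(x):
    """Sum of squared coefficients."""
    return sum(v * v for v in x)

def daer(approx, detail, eps=1e-12):
    """Detail-to-approximation energy ratio (the ratio direction is an assumption)."""
    return energy(detail) / (energy(approx) + eps)

def frame_features(frame):
    """Return (fractal dimension of the approximation part, DAER) for one frame."""
    approx, detail = haar_dwt(frame)
    return katz_fd(approx), daer(approx, detail)

# Synthetic frames standing in for silence, a smooth segment, and a noisy segment.
rng = random.Random(0)
silence = [0.0] * 256
tone = [math.sin(2 * math.pi * i / 64) for i in range(256)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(256)]

for name, frame in (("silence", silence), ("tone", tone), ("noise", noise)):
    fd, ratio = frame_features(frame)
    print(f"{name}: FD={fd:.3f}  DAER={ratio:.4f}")
```

In this sketch a detector would threshold the first feature to separate speech from non-speech frames and the second to separate initials from finals. The thresholds and the decision logic of the paper's algorithm are not given in the abstract and are not reproduced here.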
