Wavelet method of speech segmentation

This paper proposes a new method of speech segmentation based on power fluctuations in the wavelet spectrum of a speech signal. Most approaches to speech recognition divide the signal into segments of constant duration, which requires windowing to reduce distortion at the segment boundaries. A more natural approach is to segment the signal on the basis of time-frequency analysis, placing boundaries where the energy in a frequency band changes rapidly. Most existing methods of non-constant segmentation either need training on particular data or are implemented as part of the modelling stage. Here we apply the discrete wavelet transform (DWT) to the speech signal and analyse the resulting power spectrum and its derivatives. This information allows us to locate phoneme boundaries, which is the first stage of a speech recognition process. We also evaluate the method by comparing its output with hand segmentation. The method proves effective at finding most phoneme boundaries, and its results are more useful for speech recognition than those of constant segmentation.
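The idea of placing boundaries where band power changes rapidly can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the wavelet, the number of decomposition levels, the frame length, or the decision rule, so the Haar wavelet, the function names `haar_dwt` and `segment_boundaries`, and the parameters `levels`, `frame_len`, and `rel_threshold` are all assumptions made for illustration. The first difference of the per-frame band power stands in for the derivative of the power spectrum.

```python
import math


def haar_dwt(signal, levels):
    """Return Haar detail coefficients per level, finest band first.

    The Haar wavelet is an assumption; the abstract does not name one.
    """
    approx = list(signal)
    details = []
    scale = math.sqrt(2.0)
    for _ in range(levels):
        pairs = len(approx) // 2
        if pairs == 0:
            break
        detail = [(approx[2 * i] - approx[2 * i + 1]) / scale for i in range(pairs)]
        approx = [(approx[2 * i] + approx[2 * i + 1]) / scale for i in range(pairs)]
        details.append(detail)
    return details


def segment_boundaries(signal, levels=3, frame_len=32, rel_threshold=0.5):
    """Return sample indices where the wavelet band power changes sharply.

    All parameter values are illustrative assumptions, not values from the paper.
    """
    if frame_len % (1 << levels) != 0:
        raise ValueError("frame_len must be a multiple of 2**levels")
    details = haar_dwt(signal, levels)
    n_frames = len(signal) // frame_len

    # Mean power of each band within each frame.
    powers = []
    for level, detail in enumerate(details, start=1):
        per_frame = frame_len >> level  # coefficients of this band per frame
        powers.append([
            sum(v * v for v in detail[f * per_frame:(f + 1) * per_frame]) / per_frame
            for f in range(n_frames)
        ])

    # Discrete derivative of the power, summed in magnitude over all bands.
    score = [0.0] * n_frames
    for f in range(1, n_frames):
        score[f] = sum(abs(p[f] - p[f - 1]) for p in powers)

    peak = max(score, default=0.0)
    if peak == 0.0:
        return []

    # Keep local maxima of the score that exceed a fraction of the largest change.
    boundaries = []
    for f in range(1, n_frames):
        right = score[f + 1] if f + 1 < n_frames else 0.0
        if score[f] >= rel_threshold * peak and score[f] >= score[f - 1] and score[f] > right:
            boundaries.append(f * frame_len)
    return boundaries


# Synthetic example: a slow sine followed by a rapidly alternating signal.
signal = [math.sin(2 * math.pi * n / 32) for n in range(256)]
signal += [1.0 if n % 2 == 0 else -1.0 for n in range(256)]
print(segment_boundaries(signal))
```

On this synthetic input the only sharp change in band power lies at the junction between the two halves, so the sketch reports a single boundary there. Real speech would need smoothing of the power envelopes and a more careful decision rule than a single relative threshold.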
