Continuous Arabic Speech Segmentation using FFT Spectrogram

This paper describes a phoneme segmentation algorithm based on the fast Fourier transform (FFT) spectrogram. The algorithm was implemented and tested on continuous Arabic speech utterances from 10 male speakers, containing approximately 2346 phonemes in total. The system locates the phoneme boundaries and labels each resulting segment as a pause, a vowel, or a consonant. Intensity and phoneme duration are used to separate pauses from consonants. Intensity in particular is used to detect two specific consonants (/r/ and /h/) when the spectrographic information alone does not reveal them. The overall system achieves a segmentation accuracy of 95.39%.
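The abstract gives no implementation details, so the following is only a minimal Python sketch of the general idea: compute an FFT magnitude spectrogram, derive a per-frame intensity from it, and mark low-intensity runs that last long enough as pauses. The function names, the Hann window, the frame and hop sizes, and the threshold and minimum-duration parameters are all illustrative assumptions, not the authors' method or settings.

```python
import numpy as np

def stft_spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram from Hann-windowed frames (one row per frame).

    frame_len and hop are assumed values, not taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def frame_intensity_db(spec, eps=1e-12):
    """Per-frame intensity in dB, computed from the summed power spectrum."""
    return 10.0 * np.log10(np.sum(spec ** 2, axis=1) + eps)

def find_pauses(intensity_db, threshold_db, min_frames):
    """Return (start, end) frame-index pairs, end exclusive, for runs of
    frames below threshold_db that last at least min_frames frames.

    Both parameters are hypothetical tuning knobs standing in for the
    intensity and duration criteria mentioned in the abstract.
    """
    quiet = intensity_db < threshold_db
    pauses, start = [], None
    for i, is_quiet in enumerate(quiet):
        if is_quiet and start is None:
            start = i                      # a quiet run begins
        elif not is_quiet and start is not None:
            if i - start >= min_frames:    # keep only sufficiently long runs
                pauses.append((start, i))
            start = None
    if start is not None and len(quiet) - start >= min_frames:
        pauses.append((start, len(quiet)))
    return pauses
```

In use, one might set the threshold relative to the loudest frame, for example `find_pauses(db, db.max() - 60.0, 5)`. Short low-intensity runs that fail the duration test are left unlabelled here; a fuller system would need further spectral cues to classify them as consonants or vowels.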
