Word juncture modeling using phonological rules for HMM-based continuous speech recognition

Between-word context-dependent phones have been proposed to provide a more precise phonetic representation of word junctures. This technique accurately models soft pronunciation changes, in which a phone undergoes a comparatively small alteration. Hard pronunciation changes, in which a phone is completely deleted or replaced by a different phone, are much less frequent and therefore cannot be modeled adequately because too little training material is available. To overcome this problem, a set of phonological rules is used to redefine word junctures, specifying how to replace or delete the boundary phones according to the neighboring phones. No new speech units are required, which avoids most of the training issues. Results on the 991-word speaker-independent DARPA task show that phonological rules provide effective corrective capability at low computational cost.
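
As an illustration of how a rule might replace or delete a boundary phone according to its neighbor across the juncture, the following is a minimal sketch and not the authors' implementation. The `JunctureRule` type, the `apply_juncture_rules` function, the ARPAbet-style phone labels, and the two example rules are all hypothetical, chosen only to show one replacement and one deletion.

```python
from typing import List, NamedTuple, Optional, Sequence, Tuple


class JunctureRule(NamedTuple):
    """Hypothetical rule format: rewrite one boundary phone given its neighbor."""
    side: str                   # "left" = last phone of word 1, "right" = first phone of word 2
    target: str                 # boundary phone the rule rewrites
    neighbor: str               # phone on the other side of the juncture
    replacement: Optional[str]  # new phone, or None to delete the target


# Illustrative rules only; they are assumptions, not rules taken from the paper.
EXAMPLE_RULES = [
    JunctureRule("left", "t", "y", "ch"),  # replacement: final "t" before initial "y"
    JunctureRule("left", "s", "s", None),  # deletion: final "s" before initial "s"
]


def apply_juncture_rules(
    left: Sequence[str],
    right: Sequence[str],
    rules: Sequence[JunctureRule] = EXAMPLE_RULES,
) -> Tuple[List[str], List[str]]:
    """Apply the first matching rule to the juncture between two phone strings."""
    left, right = list(left), list(right)
    if not left or not right:
        return left, right
    last, first = left[-1], right[0]
    for rule in rules:
        new = [] if rule.replacement is None else [rule.replacement]
        if rule.side == "left" and rule.target == last and rule.neighbor == first:
            return left[:-1] + new, right
        if rule.side == "right" and rule.target == first and rule.neighbor == last:
            return left, new + right[1:]
    return left, right  # no rule matched: juncture is unchanged
```

Because a rule only rewrites the phone sequence at the boundary, the result is still spelled with phones from the existing inventory, which is consistent with the statement that no new speech units are required.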
