HMM continuous speech recognition using predictive LR parsing

The authors propose a continuous-speech recognition method in which an LR parser, an accurate and efficient parsing mechanism, drives HMM (hidden Markov model) modules directly, without any intervening structure such as a phoneme lattice. The method was tested in Japanese phrase recognition experiments with two grammars: a general Japanese grammar and a task-specific grammar. With the general grammar, the phrase recognition rate was 72% for the top candidate and 95% within the five best candidates. With the task-specific grammar, the rates were 80% and 99%, respectively.
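The idea of a grammar predicting which phones may come next, with only those phones scored acoustically, can be illustrated with a rough sketch. This is not the authors' implementation. It makes several loud assumptions: the LR parsing table is replaced by a simple prefix lookup over a hypothetical three-phrase list (`PHRASES`), the HMM phone models are replaced by a stand-in scorer (`phone_log_score`) that reads precomputed per-segment phone probabilities, and the input is assumed to be already split into one segment per phone. All names here are illustrative.

```python
import math

# Hypothetical toy "grammar": a finite set of phrases spelled as phone sequences.
# A real system would use an LR parsing table compiled from a context-free grammar.
PHRASES = {
    "kore": ["k", "o", "r", "e"],
    "kare": ["k", "a", "r", "e"],
    "sore": ["s", "o", "r", "e"],
}


def predict_next(prefix):
    """Return the phones the toy grammar allows after the given phone prefix."""
    n = len(prefix)
    return sorted({p[n] for p in PHRASES.values() if len(p) > n and p[:n] == prefix})


def is_complete(prefix):
    """True if the phone prefix spells a whole phrase of the toy grammar."""
    return prefix in PHRASES.values()


def phone_log_score(phone, segment):
    """Stand-in for HMM phone verification (assumption, not a real HMM).

    `segment` is a dict mapping phone labels to probabilities for one
    pre-segmented stretch of speech; unseen phones get a small floor value.
    """
    return math.log(segment.get(phone, 1e-6))


def recognize(segments, beam_width=5):
    """Beam search in which only grammar-predicted phones are ever scored."""
    beam = [(0.0, [])]  # each hypothesis: (log score, phone prefix)
    for segment in segments:
        extended = []
        for score, prefix in beam:
            for phone in predict_next(prefix):
                new_score = score + phone_log_score(phone, segment)
                extended.append((new_score, prefix + [phone]))
        # Keep the best hypotheses, highest log score first.
        extended.sort(key=lambda hyp: hyp[0], reverse=True)
        beam = extended[:beam_width]
    return [(score, "".join(prefix)) for score, prefix in beam if is_complete(prefix)]
```

Calling `recognize` with a list of per-segment probability dicts returns the surviving complete phrases, best first, so reading off the first entry or the first five corresponds to the top-candidate and five-best figures quoted above. Because no phoneme lattice is built, phones the grammar cannot accept at a given point are never scored at all.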
