Construction of Corpus-Based Syntactic Rules for Accurate Speech Recognition

This paper describes the syntactic rules applied in the Japanese speech recognition module of a speech-to-speech translation system. Japanese is considered a free word/phrase order language. Since syntactic rules serve as constraints to reduce the search space in speech recognition, rules that admit all possible phrase orders have almost the same effect as using no constraints at all. Instead, we take into account the recognition weaknesses of certain syntactic categories and treat them precisely, so that a minimal number of rules works most effectively. In this paper we first examine which syntactic categories are easily misrecognized. Second, we consult our dialogue corpus in order to give the rules broad generality. Based on both studies, we refine the rules. Finally, we verify the validity of the refinement through speech recognition experiments.
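The core observation above can be illustrated with a minimal sketch (a hypothetical toy example, not the authors' actual grammar): if a phrase-order grammar licenses every permutation of the phrases in an utterance, it prunes no recognition hypotheses, whereas even one well-chosen constraint, such as requiring the predicate to be phrase-final, cuts the hypothesis space. The category labels below are illustrative stand-ins for Japanese bunsetsu phrases.

```python
from itertools import permutations

# Hypothetical phrase categories: subject phrase, object phrase, verb.
PHRASES = ["NP-ga", "NP-o", "V"]

def allowed_free_order(seq):
    # A grammar admitting every phrase order: accepts everything,
    # so it constrains the recognizer's search not at all.
    return True

def allowed_verb_final(seq):
    # One targeted constraint: the predicate must come last
    # (verb-final), which genuinely prunes hypotheses.
    return seq[-1] == "V"

hypotheses = list(permutations(PHRASES))
pruned_free = [h for h in hypotheses if allowed_free_order(h)]
pruned_vf = [h for h in hypotheses if allowed_verb_final(h)]

print(len(hypotheses), len(pruned_free), len(pruned_vf))  # 6 6 2
```

With three phrases there are 6 orderings; the permissive grammar keeps all 6, while the verb-final constraint keeps only 2. This is the sense in which rules covering "all possible phrase orders" behave like no constraints, motivating the paper's focus on a small set of precisely targeted rules.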