A spoken language translator for restricted-domain context-free languages

Abstract: An effort is underway at AT&T Bell Laboratories and Telefonica Investigacion y Desarrollo to build a restricted-domain spoken language translation system, which we call VEST (Voice English/Spanish Translator). The eventual goal is a speaker-independent voice-output translator with a vocabulary of several thousand words covering a specific application. This paper describes the first step of our research: a system that recognizes two speakers in each of Spanish and English and is limited to some four hundred words. The key new idea is that speech recognition and language analysis are tightly coupled by using the same language model, an augmented phrase-structure grammar, for both.
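The idea of one grammar serving both recognition and analysis can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the VEST system: the toy grammar, the function names `parse`, `expand`, `analyze`, and `recognize`, and the scored hypothesis list are hypothetical. The sketch also simplifies the coupling: it filters a list of scored word sequences after the fact, whereas a tightly coupled recognizer would apply the grammar during the search itself.

```python
# Hypothetical toy grammar (not from the paper): each nonterminal maps to
# a list of right-hand sides; any symbol without an entry is a terminal word.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "N": [["bank"], ["teller"]],
    "V": [["opens"], ["calls"]],
}


def parse(symbol, words, start):
    """Yield (tree, end) for every way `symbol` derives words[start:end]."""
    if symbol not in GRAMMAR:
        # Terminal: it must match the next input word exactly.
        if start < len(words) and words[start] == symbol:
            yield symbol, start + 1
        return
    for rhs in GRAMMAR[symbol]:
        for children, end in expand(rhs, words, start):
            yield (symbol, *children), end


def expand(rhs, words, start):
    """Yield (children, end) for every way the sequence `rhs` matches."""
    if not rhs:
        yield (), start
        return
    for tree, mid in parse(rhs[0], words, start):
        for rest, end in expand(rhs[1:], words, mid):
            yield (tree, *rest), end


def analyze(words):
    """Language analysis: return a parse tree covering all words, or None."""
    for tree, end in parse("S", words, 0):
        if end == len(words):
            return tree
    return None


def recognize(hypotheses):
    """Recognition constraint: best-scoring hypothesis the grammar accepts.

    `hypotheses` is a list of (score, words) pairs, higher score first.
    The same GRAMMAR that rejects ungrammatical hypotheses also supplies
    the parse tree handed on to later stages.
    """
    for _score, words in sorted(hypotheses, key=lambda h: -h[0]):
        tree = analyze(words)
        if tree is not None:
            return words, tree
    return None


if __name__ == "__main__":
    candidates = [
        (0.9, ["the", "bank", "opens", "teller"]),         # rejected by grammar
        (0.7, ["the", "teller", "opens", "the", "bank"]),  # accepted
    ]
    print(recognize(candidates))
```

Because the grammar here has no left-recursive rules, the simple top-down search terminates; a grammar with left recursion would need a chart-based method such as Earley's algorithm instead.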
