Paper information - A one-stage decoder for interpretation of natural speech

Current speech understanding systems are typically designed as multistage systems, although in theory this gives rise to errors caused by early decisions. We present a framework that offers the chance to reduce these errors through an integrated system that computes a semantic tree representation directly from the input speech signal, using a token-passing-based one-stage decoder called ODINS. To limit the complexity of ODINS, we represent all a priori knowledge consistently with a generalized uniform knowledge model based on a hierarchy of probabilistic transition networks, which can also be n-grams. Our framework includes a method for evaluating the system output using an edit-distance-based tree matching algorithm. First experiments quantify and confirm the theoretical advantage of the one-stage strategy over a corresponding two-stage approach.
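To make the token-passing idea concrete, the following is a minimal sketch of best-path decoding over a single, flat probabilistic transition network. It is not the ODINS decoder: it has no network hierarchy, no acoustic models, and no semantic tree output. The function name `token_passing`, the toy states `S`, `A`, `B`, and all probabilities are assumptions made up for illustration.

```python
import math


def token_passing(transitions, emissions, observations, start, finals):
    """Return (log_score, state_path) of the best path, or None if none exists.

    transitions: dict mapping state -> list of (next_state, probability)
    emissions:   dict mapping state -> dict of symbol -> probability
    start:       non-emitting start state
    finals:      set of states in which a path may end
    """
    # Each state holds at most one token: (log score, history of visited states).
    tokens = {start: (0.0, [])}
    for obs in observations:
        new_tokens = {}
        for state, (score, path) in tokens.items():
            # Copy the token along every outgoing arc and update its score.
            for nxt, p_trans in transitions.get(state, []):
                p_emit = emissions[nxt].get(obs, 0.0)
                if p_trans <= 0.0 or p_emit <= 0.0:
                    continue  # impossible step, the token is dropped
                cand = score + math.log(p_trans) + math.log(p_emit)
                # Keep only the best token arriving at each state.
                if nxt not in new_tokens or cand > new_tokens[nxt][0]:
                    new_tokens[nxt] = (cand, path + [nxt])
        tokens = new_tokens
    # Pick the best surviving token among the allowed final states.
    survivors = [tokens[s] for s in finals if s in tokens]
    return max(survivors, key=lambda t: t[0]) if survivors else None


# Hypothetical toy network: two emitting states reachable from start state "S".
TRANSITIONS = {
    "S": [("A", 0.6), ("B", 0.4)],
    "A": [("A", 0.7), ("B", 0.3)],
    "B": [("A", 0.4), ("B", 0.6)],
}
EMISSIONS = {
    "A": {"x": 0.9, "y": 0.1},
    "B": {"x": 0.2, "y": 0.8},
}

score, path = token_passing(TRANSITIONS, EMISSIONS, ["x", "x", "y"], "S", {"A", "B"})
print(path, math.exp(score))
```

Because each token carries its own history, the best path is read off the winning token at the end without a separate traceback. In a hierarchical setting, a token would additionally have to record which sub-networks it entered and left, which this sketch does not attempt.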
