A word graph based N-best search in continuous speech recognition

The authors introduce an efficient algorithm for the exhaustive search of the N best sentence hypotheses in a word graph. The search is organized in two passes. In the first pass, a word graph is constructed with standard time-synchronous beam search. In the second pass, the N best word sequences are extracted from the word graph. Because the N-best list is organized as a tree, the search operates directly on the word graph, so N hypotheses need not be tracked in parallel at each processing step. The algorithm produces an exact N-best list as defined by the word graph structure; errors can arise only from pruning during construction of the word graph. In a postprocessing step, the N candidates can be rescored with a more complex language model at greatly reduced computational cost. The algorithm is also applied in speech understanding to select the most likely sentence hypothesis that satisfies additional constraints.
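The second pass can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the word graph is an acyclic list of `(src, dst, word, cost)` edges with additive costs (for example, negative log probabilities), and it swaps in a generic best-first search guided by exact backward scores. The function name `n_best`, the edge layout, and the toy graph are all hypothetical. Partial hypotheses share their prefixes through back-pointers, which loosely mirrors the idea of a tree-organized N-best list.

```python
import heapq
import math
from collections import defaultdict


def n_best(edges, start, goal, n):
    """Return up to n (cost, words) pairs, cheapest first.

    Assumes an acyclic word graph given as (src, dst, word, cost) edges
    and a goal node at which every complete hypothesis ends.
    """
    out = defaultdict(list)
    for src, dst, word, cost in edges:
        out[src].append((dst, word, cost))

    # Backward scores: cheapest cost from each node to the goal.
    togo = {goal: 0.0}

    def best_to_goal(node):
        if node not in togo:
            togo[node] = min(
                (c + best_to_goal(d) for d, _, c in out[node]),
                default=math.inf,
            )
        return togo[node]

    # Heap entry: (estimated total, tie-breaker, cost so far, node, back-pointer).
    # A back-pointer is (word, parent back-pointer), so hypotheses that share
    # a prefix share the same chain, forming a prefix tree.
    heap = [(best_to_goal(start), 0, 0.0, start, None)]
    counter = 1
    results = []
    while heap and len(results) < n:
        total, _, so_far, node, back = heapq.heappop(heap)
        if node == goal:
            # Walk the back-pointers to recover the word sequence.
            words = []
            while back is not None:
                word, back = back
                words.append(word)
            results.append((total, words[::-1]))
            continue
        for dst, word, cost in out[node]:
            rest = best_to_goal(dst)
            if rest < math.inf:  # skip edges that cannot reach the goal
                entry = (so_far + cost + rest, counter, so_far + cost, dst, (word, back))
                heapq.heappush(heap, entry)
                counter += 1
    return results


# Toy word graph (hypothetical costs).
EDGES = [
    (0, 1, "the", 1.0), (0, 1, "a", 1.25),
    (1, 2, "cat", 2.0), (1, 2, "cap", 2.5),
    (2, 3, "sat", 1.0),
]
for cost, words in n_best(EDGES, start=0, goal=3, n=3):
    print(cost, " ".join(words))
```

Because the backward scores are exact, complete hypotheses leave the queue in order of increasing total cost, so the list is exact with respect to the graph; any errors would come from what the first pass pruned out of the graph. The returned candidates could then be rescored with a more expensive language model.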
