Integration of Diverse Recognition Methodologies Through Reevaluation of N-Best Sentence Hypotheses

This paper describes a general formalism for integrating two or more speech recognition technologies, which may have been developed at different research sites using different recognition strategies. In this formalism, one system uses the N-best search strategy to generate a list of candidate sentences, the other systems rescore that list, and the resulting scores are combined to optimize performance. Specifically, we report on combining the BU system, based on stochastic segment models, with the BBN system, based on hidden Markov models. In addition to facilitating the integration of different systems, the N-best approach greatly reduces the computation required for word recognition with the stochastic segment model.
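The rescoring step can be illustrated with a minimal sketch. It assumes each system assigns a log-domain score (higher is better) to every candidate sentence and that the scores are combined as a weighted sum. The names `Hypothesis`, `combined_score`, and `rerank`, the system labels `"hmm"` and `"ssm"`, and all sentences, scores, and weights below are hypothetical and chosen only for illustration. The abstract does not state the combination rule or how the weights are optimized.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    """One candidate sentence from the N-best list (illustrative structure)."""
    words: str
    scores: Dict[str, float]  # log-domain score per system; higher is better


def combined_score(hyp: Hypothesis, weights: Dict[str, float]) -> float:
    # Assumed combination rule: weighted sum of the per-system scores.
    return sum(w * hyp.scores[name] for name, w in weights.items())


def rerank(nbest: List[Hypothesis], weights: Dict[str, float]) -> List[Hypothesis]:
    # Reorder the N-best list by combined score, best hypothesis first.
    return sorted(nbest, key=lambda h: combined_score(h, weights), reverse=True)


# Made-up N-best list: one system generated it, the other rescored it.
nbest = [
    Hypothesis("show all ships", {"hmm": -120.0, "ssm": -95.0}),
    Hypothesis("show all chips", {"hmm": -118.0, "ssm": -101.0}),
    Hypothesis("show tall ships", {"hmm": -125.0, "ssm": -99.0}),
]

# Hypothetical weights; in practice they would be tuned on held-out data.
weights = {"hmm": 1.0, "ssm": 1.0}
best = rerank(nbest, weights)[0]
print(best.words)
```

In this toy list, the first system alone would prefer a different sentence than the combined score does, which is the situation in which rescoring can change the recognized output. Because the second system only scores the N listed sentences instead of searching over all word sequences, its computation is limited to those N candidates.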
