Methods for Partial Sentence Recognition and Unknown Words Detection by Sentence Spotting on Continuous Speech

Spontaneous speech includes many sentences that fall outside the task domain. Furthermore, sentence boundaries are often unclear in spontaneous speech because of corrections, stammering, or overlap with the next utterance. We previously developed a sentence spotting system that uses Vector Continuous Dynamic Programming (VCDP). This system works well for sentence spotting in spontaneous speech [1] because it does not need to consider sentence boundaries or utterances that fall outside the task domain. However, the previous system supported only "complete sentence" utterances. Partial sentences, which consist of parts of complete sentences and are intended to convey almost the same meaning, often appear in spontaneous speech. Flexible recognition requires handling such expressions, even though partial sentences vary widely in form. We propose an extension of the sentence spotting algorithm that efficiently accepts partial sentences [2]. The processing of unknown words is also one of the most important factors in dealing with spontaneous speech, because an utterance will often include words that are unknown to the system. We therefore propose an unknown word detection algorithm within the sentence spotting framework. With these extensions, the sentence spotting algorithm is now capable of both accepting partial sentences and detecting unknown words [3].

[1]  Yoshiaki Itoh et al. Sentence spotting applied to partial sentences and unknown words, 1994, Proceedings of ICASSP '94, IEEE International Conference on Acoustics, Speech and Signal Processing.

[2]  Kiyohiro Shikano et al. Japanese Phonetic Typewriter Using HMM Phone Recognition and Stochastic Phone-Sequence Modeling, 1991.

[3]  Yoshiaki Itoh et al. Spontaneous speech recognition by sentence spotting, 1993, EUROSPEECH.