Repeated utterance extraction by a new algorithm for labeling a presentation speech

This paper proposes a new, efficient algorithm for extracting similar sections between two time-sequence data sets. The algorithm, called Relay Continuous Dynamic Programming (Relay CDP), performs fast matching between arbitrary sections of the reference pattern and the input speech, and extracts similar sections frame-synchronously. We extend Relay CDP to extract repeated utterances in a presentation speech, because repeated utterances, such as the same words, phrases, or sentences, are assumed to be important in the speech. These repeated utterances can serve as labels for information retrieval. The paper describes Relay CDP in detail and evaluates its performance on two tasks: extracting similar sections between two speech data sets, and identifying repeated utterances by extracting similar sections within a single presentation speech.
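To make the idea of matching arbitrary sections of two sequences concrete, here is a minimal sketch. It is not Relay CDP, whose details are not given in this abstract; it is a generic local-alignment dynamic program, swapped in only to illustrate the task. The function name `best_similar_section`, the `bonus` parameter, and the use of scalar values in place of acoustic feature vectors are all assumptions made for the illustration.

```python
def best_similar_section(ref, inp, bonus=1.0):
    """Find the best-matching pair of sections in two numeric sequences.

    Illustrative local-alignment DP (NOT the paper's Relay CDP).
    Returns (score, (ref_start, ref_end), (inp_start, inp_end)),
    with 0-based starts and exclusive ends.
    """
    n, m = len(ref), len(inp)
    # prev_score[j]: best score of a section pair ending at the previous
    # ref frame and inp frame j-1; prev_start[j]: where that pair began.
    prev_score = [0.0] * (m + 1)
    prev_start = [None] * (m + 1)
    best = (0.0, (0, 0), (0, 0))

    for i in range(1, n + 1):
        cur_score = [0.0] * (m + 1)
        cur_start = [None] * (m + 1)
        for j in range(1, m + 1):
            # Local similarity: positive for close frames, negative otherwise.
            local = bonus - abs(ref[i - 1] - inp[j - 1])
            candidates = [
                (prev_score[j - 1], prev_start[j - 1]),  # both advance
                (prev_score[j], prev_start[j]),          # only ref advances
                (cur_score[j - 1], cur_start[j - 1]),    # only inp advances
            ]
            prev_best, start = max(candidates, key=lambda c: c[0])
            if prev_best <= 0.0:
                # Nothing worth extending: a new section starts here.
                start = (i - 1, j - 1)
            score = prev_best + local
            if score > 0.0:
                cur_score[j] = score
                cur_start[j] = start
                if score > best[0]:
                    best = (score, (start[0], i), (start[1], j))
        prev_score, prev_start = cur_score, cur_start
    return best


# Toy data: the run 5, 6, 7 occurs in both sequences.
reference = [0, 0, 5, 6, 7, 0, 0]
input_seq = [9, 9, 9, 5, 6, 7, 9]
score, ref_span, inp_span = best_similar_section(reference, input_seq)
```

The sketch keeps only two rows of the table at a time, so memory grows with the input length rather than with the product of both lengths. It processes the whole reference for every frame, which is the kind of cost a faster method would aim to reduce.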
