Paper information - An efficient partial matching algorithm toward speech retrieval by speech

This paper proposes a new, efficient partial matching algorithm called Island Driven Partial Matching (IDPM), based on Continuous Dynamic Programming (CDP), to realize flexible retrieval from a speech database by query speech. IDPM efficiently detects the sections in the speech database that match partial sections of the query speech. It applies CDP to short, fixed-length unit reference patterns, which are taken from the query speech, and finds the best-matching island sections in the speech database. Similar sections of arbitrary length are then detected by checking only those island sections. Experiments on conversational speech showed that IDPM enables fast matching between arbitrary sections of the reference pattern and the input speech without degrading the performance in detecting similar sections compared with our former method.
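
The island idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain subsequence dynamic time warping with unconstrained step directions and Euclidean frame distance in place of the paper's CDP recurrence, non-overlapping units, and a simple adjacency rule for joining islands. The names `cdp_spot`, `find_islands`, `merge_islands`, `unit_len`, `threshold`, and `max_gap` are hypothetical.

```python
import math


def frame_dist(a, b):
    """Euclidean distance between two feature vectors (assumed local distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cdp_spot(unit, database):
    """For every database frame t, return (normalized cost, start frame) of the
    best path that aligns the whole unit and ends at t. The start is free, so a
    path may begin at any database frame (spotting-style matching)."""
    n = len(unit)
    prev = None  # column of (cost, start) pairs for the previous database frame
    results = []
    for t, frame in enumerate(database):
        cur = [None] * n
        for i in range(n):
            d = frame_dist(unit[i], frame)
            if i == 0:
                candidates = [(0.0, t)]           # start a new path here
                if prev is not None:
                    candidates.append(prev[0])    # stay on the first unit frame
            else:
                candidates = [cur[i - 1]]         # advance in the unit only
                if prev is not None:
                    candidates.append(prev[i])      # advance in the database only
                    candidates.append(prev[i - 1])  # advance in both
            best_cost, best_start = min(candidates)
            cur[i] = (best_cost + d, best_start)
        results.append((cur[n - 1][0] / n, cur[n - 1][1]))
        prev = cur
    return results


def find_islands(query, database, unit_len, threshold):
    """Cut the query into fixed-length units and keep, for each unit, its best
    database match if the normalized cost is within the threshold."""
    islands = []
    for q_start in range(0, len(query) - unit_len + 1, unit_len):
        unit = query[q_start:q_start + unit_len]
        spots = cdp_spot(unit, database)
        cost, db_end = min((c, t) for t, (c, _) in enumerate(spots))
        db_start = spots[db_end][1]
        if cost <= threshold:
            islands.append((q_start, q_start + unit_len - 1, db_start, db_end, cost))
    return islands


def merge_islands(islands, max_gap):
    """Join islands of consecutive query units whose database sections follow
    each other within max_gap frames (an assumed, simplified joining rule)."""
    sections = []
    for q_s, q_e, d_s, d_e, _ in islands:
        if sections:
            last = sections[-1]
            if q_s == last[1] + 1 and 0 <= d_s - last[3] <= max_gap:
                sections[-1] = (last[0], q_e, last[2], d_e)
                continue
        sections.append((q_s, q_e, d_s, d_e))
    return sections
```

In this sketch each unit is matched against the whole database once, and longer similar sections are assembled only from the surviving islands, which is the source of the efficiency the abstract claims. Real speech features would be multi-dimensional frame vectors such as cepstral coefficients, and the threshold would need tuning on data.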

Shi-wook Lee | Kazuyo Tanaka | Yoshiaki Itoh
