Automatic detection of topic boundaries and keywords in arbitrary speech using incremental reference interval-free continuous DP

We propose an approach for detecting topic boundaries and keywords in arbitrary speech without either speech recognition or prosodic processing, with the aim of guiding access to the content of raw recorded speech. The approach rests on the general tendency for phrases and words that are repeated frequently in speech to be characteristic of the topics of the discourse. It therefore uses pairs of phonetically similar segments (PPSSs) to represent the topics in speech. Because it requires neither a language model nor an acoustic model in advance, the approach is domain and language independent and robust against variations in speaker and background noise. Experiments using simulated dialogues confirmed that the approach performs well. We also propose Incremental Reference Interval-Free Continuous Dynamic Programming (IRIFCDP) as the algorithm for detecting PPSSs in speech. IRIFCDP detects PPSSs efficiently and in synchronization with the incoming speech, which makes it suitable for long speech samples.
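
To make the notion of a PPSS concrete, the following is a minimal sketch that finds one pair of similar, non-overlapping segments inside a single sequence of feature frames. It is not IRIFCDP: it is a plain, quadratic-time local-alignment dynamic program that processes the whole recording at once rather than incrementally. The function name `find_similar_pair`, the `threshold` and `min_gap` parameters, and the use of Euclidean distance between frames are all assumptions made for illustration.

```python
from math import dist


def find_similar_pair(frames, threshold=0.5, min_gap=1):
    """Return (score, (i_start, i_end), (j_start, j_end)) for the best pair of
    similar, non-overlapping segments in `frames`, or None if none is found.

    Illustrative only: a batch local-alignment DP, not the IRIFCDP algorithm.
    Indices are 0-based and inclusive.
    """
    n = len(frames)
    # score[i][j]: best accumulated similarity of two segments that end at
    # frames i-1 and j-1; start[i][j]: where those two segments begin.
    score = [[0.0] * (n + 1) for _ in range(n + 1)]
    start = [[None] * (n + 1) for _ in range(n + 1)]
    best = None
    for i in range(1, n + 1):
        # Only compare a frame with later frames, so a segment is never
        # matched against itself.
        for j in range(i + min_gap, n + 1):
            # Positive when the two frames are closer than `threshold`
            # (an assumed tuning parameter), negative otherwise.
            reward = threshold - dist(frames[i - 1], frames[j - 1])
            prev_score, prev_start = max(
                (score[i - 1][j - 1], start[i - 1][j - 1]),  # both advance
                (score[i - 1][j], start[i - 1][j]),          # first advances
                (score[i][j - 1], start[i][j - 1]),          # second advances
                key=lambda cell: cell[0],
            )
            total = prev_score + reward
            if total <= 0:
                continue  # no useful match ends here; cell stays at zero
            score[i][j] = total
            start[i][j] = prev_start if prev_score > 0 else (i - 1, j - 1)
            seg_i, seg_j = start[i][j]
            # Keep the pair only if the first segment ends before the second begins.
            if i - 1 < seg_j and (best is None or total > best[0]):
                best = (total, (seg_i, i - 1), (seg_j, j - 1))
    return best


# Toy one-dimensional "features": the run 1, 2, 3 occurs twice.
toy = [[0.0], [1.0], [2.0], [3.0], [9.0], [8.0], [1.0], [2.0], [3.0], [7.0]]
result = find_similar_pair(toy)
```

On the toy input, the sketch reports frames 1 to 3 and frames 6 to 8 as the matching pair. Real speech would supply multi-dimensional spectral feature vectors in place of these scalar frames, and a practical system would collect many such pairs rather than only the best one.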
