Extraction of important sentences using F0 information for speech summarization

This paper describes speech summarization using F0 information. Summarization is realized here by extracting important sentences from manually transcribed text. The central problem in this framework is the automatic scoring of sentence importance based on prosodic information from the speech waveform as well as linguistic information from the written text. Prosody conveys nonlinguistic information such as the speaker’s intention and contributes to identifying the important parts of the speech. The prosodic information is represented by F0 parameters of the Japanese bunsetsu unit, which is almost equivalent to a prosodic minor phrase. Six kinds of F0 parameters are compared with regard to their correlation with sentence importance and their performance in extracting important sentences. Evaluation results show that introducing F0 parameters is effective for speech summarization.
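
As a rough illustration of the framework summarized above, the following Python sketch computes a few simple F0 statistics for one bunsetsu, measures how a feature correlates with importance scores, and selects the top-scoring sentences. Everything in it is an assumption made for illustration: the abstract does not name the six F0 parameters, so the mean, maximum, and range used here are stand-ins, and the function names, the linear weighting of linguistic and prosodic scores, and the convention that unvoiced frames are marked as 0 are all hypothetical.

```python
from statistics import mean


def f0_features(f0_hz):
    """Illustrative F0 statistics for one bunsetsu (not the paper's six parameters).

    f0_hz is a list of frame-level F0 values in Hz; unvoiced frames are
    assumed to be marked as 0 and are ignored.
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        return {"mean": 0.0, "max": 0.0, "range": 0.0}
    return {
        "mean": mean(voiced),
        "max": max(voiced),
        "range": max(voiced) - min(voiced),
    }


def pearson(xs, ys):
    """Pearson correlation between a feature and importance scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov / (var_x * var_y) ** 0.5


def combined_score(linguistic, prosodic, weight=0.5):
    """Hypothetical linear mix of a linguistic and a prosodic score."""
    return (1 - weight) * linguistic + weight * prosodic


def extract_top(scores, k):
    """Indices of the k highest-scoring sentences, in document order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

Under these assumptions, each candidate F0 parameter would be passed to `pearson` together with human importance ratings to compare the parameters, and `extract_top` would be applied to the combined sentence scores to produce the summary.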