A method for automatic extraction of parameters of the fundamental frequency contour

The process of generating an F0 contour has been modeled quite accurately in mathematical terms by Fujisaki and his coworkers, but deriving the underlying commands from an observed F0 contour is an inverse problem that cannot be solved analytically. Although it can be solved by successive approximation, a good first-order approximation is necessary to guarantee an efficient and accurate search for the optimum solution. The present paper describes a method for pre-processing an observed F0 contour to obtain a smooth contour, from which a good first-order approximation can be obtained analytically. Experimental results show that the correct extraction rates for the accent commands and the phrase commands are about 90% and 79%, respectively.
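To make the forward (generation) side of the problem concrete, the following is a minimal sketch of the command-response model in its commonly published form, in which ln F0(t) is the sum of a baseline ln Fb, the responses to phrase commands (impulses), and the responses to accent commands (steps). The abstract does not state the equations or any constants, so the function names and the default values of alpha, beta, and gamma used here are assumptions chosen for illustration, not values taken from this paper. The sketch covers only contour generation; it does not implement the pre-processing or the parameter extraction that the paper describes.

```python
import math

# Assumed, illustrative constants (not taken from this paper).
ALPHA = 3.0   # natural angular frequency of the phrase control mechanism, 1/s
BETA = 20.0   # natural angular frequency of the accent control mechanism, 1/s
GAMMA = 0.9   # ceiling of the accent component


def phrase_response(t, alpha=ALPHA):
    """Impulse response of the phrase control mechanism (zero before onset)."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=BETA, gamma=GAMMA):
    """Step response of the accent control mechanism, limited to gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def log_f0(t, fb, phrase_cmds, accent_cmds):
    """Return ln F0 at time t (seconds).

    phrase_cmds: list of (onset_time, magnitude)
    accent_cmds: list of (onset_time, offset_time, amplitude)
    """
    value = math.log(fb)
    for t0, ap in phrase_cmds:
        value += ap * phrase_response(t - t0)
    for t1, t2, aa in accent_cmds:
        value += aa * (accent_response(t - t1) - accent_response(t - t2))
    return value


def synthesize(times, fb, phrase_cmds, accent_cmds):
    """Return the F0 values in Hz at each time in `times`."""
    return [math.exp(log_f0(t, fb, phrase_cmds, accent_cmds)) for t in times]


# Hypothetical example: one phrase command and one accent command, 10 ms frames.
times = [i * 0.01 for i in range(200)]
f0 = synthesize(times, 120.0, [(0.0, 0.5)], [(0.3, 0.6, 0.4)])
```

The inverse problem discussed in the abstract runs in the opposite direction: given an observed contour such as `f0`, find the command timings and magnitudes that reproduce it. Because the contour is a nonlinear function of those parameters, the search is iterative, which is why the quality of the initial estimate matters.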
