An instantaneous-frequency-based pitch extraction method for high-quality speech transformation: revised TEMPO in the STRAIGHT-suite

A new source information extraction algorithm is proposed to provide a reliable source signal for a high-quality speech analysis, modification, and transformation system called STRAIGHT (Speech Transformation and Representation based on Adaptive Interpolation of weiGHTed spectrogram). The proposed method makes use of instantaneous frequencies in harmonic components based on their reliability. A performance evaluation is conducted using a simultaneous EGG (Electroglottograph) recording as the reference signal. The error variance for F0 extraction using the proposed algorithm is shown to be about 1/3 that of the previous F0 extraction method used in STRAIGHT, although the previous algorithm is still competitive with conventional F0 extraction methods.