Perceptual effects of spectral envelope and F0 manipulations using the STRAIGHT method
A versatile speech manipulation method, STRAIGHT (Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrogram) [Kawahara, ICASSP (1997)], was applied to natural speech in order to test the effects of speech parameter manipulations. STRAIGHT consists of three procedures: STRAIGHT-core, TEMPO (Temporal domain Excitation extraction using a Minimum Perturbation Operator), and SPIKES (Synthetic Phase Impulse for Keeping Equivalent Sound). STRAIGHT-core extracts a smoothed time-frequency representation that is free from interference due to the source periodicity. TEMPO extracts F0 and other source-related information. SPIKES provides artificial "naturalness" to the synthetic speech. As a baseline, speech resynthesized with the STRAIGHT method was found to be equivalent in "naturalness" to the original speech when no parametric modification was introduced. Simultaneous manipulation of the spectral envelope and F0 illustrated tha...
[1] Hideki Kawahara et al. Speech representation and transformation using adaptive interpolation of weighted spectrum: vocoder revisited. 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, 1997.