Paper information - Speech spectral envelope estimation through explicit control of peak evolution in time

This work proposes a new approach to estimating the speech spectral envelope, adapted for applications that require time-varying spectral modifications, such as voice conversion. In particular, we represent the spectral envelope as a sum of peaks that evolve smoothly in time within a phoneme. This representation provides a flexible model of the spectral envelope that is relevant to both human speech production and perception. We highlight important properties of the proposed spectral envelope estimation, as applied to natural speech, and compare the results with those of a more traditional frame-by-frame cepstrum-based analysis. Subjective evaluations and comparisons of synthesized speech quality, as well as the implications of this work for future research, are also discussed.

Olivier Rosec | Thierry Chonavel | Elizabeth Godoy
