Paper Information - Mandarin singing voice synthesis using ANN vibrato parameter models

In this paper, the vibrato parameters of sung syllables are analyzed using the short-time Fourier transform and the analytic-signal method. Once the vibrato parameter values of all training syllables have been obtained, they are used to train one artificial neural network (ANN) per type of vibrato parameter. These ANN models then generate vibrato parameter values, which are combined with other music information to control a harmonic-plus-noise model (HNM) that synthesizes the singing voice signal. Subjective perception tests are conducted on the synthetic singing voice. The results show that singing voice synthesized with ANN-generated vibrato parameters is noticeably more natural than singing voice synthesized with fixed vibrato parameters.
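The analytic-signal step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pitch (F0) contour has already been extracted frame by frame, takes the contour mean as the intonation center, and reads the vibrato extent and rate from the magnitude and phase slope of the analytic signal. The function names, the 200 Hz frame rate, and the synthetic test contour are all hypothetical choices made for illustration.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real sequence, built by zeroing negative frequencies."""
    n = len(x)
    spectrum = np.fft.fft(x)
    gain = np.zeros(n)
    gain[0] = 1.0                      # keep the DC bin once
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep the Nyquist bin once
        gain[1:n // 2] = 2.0           # double the positive frequencies
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)

def vibrato_parameters(f0_hz, frame_rate_hz):
    """Return (center in Hz, extent in Hz, rate in Hz) of an F0 contour.

    Assumption: the contour mean is a good enough intonation center,
    which only holds for a steady note.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    center = f0.mean()
    z = analytic_signal(f0 - center)
    extent = np.abs(z)                                  # instantaneous amplitude
    phase = np.unwrap(np.angle(z))                      # instantaneous phase
    rate = np.diff(phase) * frame_rate_hz / (2.0 * np.pi)
    trim = len(f0) // 10                                # drop edges, where estimates are least reliable
    return (float(center),
            float(np.median(extent[trim:-trim])),
            float(np.median(rate[trim:-trim])))

# Synthetic contour: 220 Hz note with a 5 Hz deep, 5.5 Hz vibrato, 2 s at 200 frames/s.
frame_rate = 200.0
t = np.arange(400) / frame_rate
contour = 220.0 + 5.0 * np.sin(2.0 * np.pi * 5.5 * t)
center, extent, rate = vibrato_parameters(contour, frame_rate)
print(f"center={center:.2f} Hz, extent={extent:.2f} Hz, rate={rate:.2f} Hz")
```

On real recordings the contour would first need a slowly varying intonation trend removed and some smoothing applied, since a plain mean and an unsmoothed phase derivative are sensitive to pitch-tracking errors.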

Hung-Yan Gu | Zheng-Fu Lin
