Mandarin singing voice synthesis using ANN vibrato parameter models

In this paper, the vibrato parameters of sung syllables are analyzed using the short-time Fourier transform and the analytic signal method. After the vibrato parameter values for all training syllables are obtained, they are used to train an artificial neural network (ANN) for each type of vibrato parameter. These ANN models are then used to generate vibrato parameter values, which, together with other music information, control a harmonic-plus-noise model (HNM) to synthesize singing voice signals. Subjective perception tests are conducted on the synthetic singing voice. The results show that the singing voice synthesized with the ANN-generated vibrato parameters is clearly more natural than the singing voice synthesized with fixed vibrato parameters.
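
As an illustration of the analytic signal step, the following minimal sketch estimates a vibrato extent track and a vibrato rate track from a pitch contour. It is not the paper's implementation: it assumes the pitch contour has already been extracted (for example, from short-time Fourier analysis), that the intended pitch can be approximated by the contour's mean, and that the vibrato is roughly sinusoidal. The function names `analytic_signal` and `vibrato_parameters`, the 200 frames-per-second contour rate, and the synthetic test contour are all hypothetical choices made for the example.

```python
import numpy as np


def analytic_signal(x):
    """Return the analytic signal of a real sequence using the FFT.

    Negative-frequency bins are zeroed and positive-frequency bins are
    doubled, so the imaginary part is the Hilbert transform of x.
    """
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the DC bin unchanged
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist bin unchanged
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def vibrato_parameters(f0_hz, frame_rate):
    """Estimate vibrato extent (Hz) and rate (Hz) tracks from a pitch contour.

    Assumption: the intended (non-vibrato) pitch is the mean of the contour.
    """
    deviation = f0_hz - np.mean(f0_hz)    # pitch deviation around the note
    z = analytic_signal(deviation)
    extent = np.abs(z)                    # instantaneous amplitude
    phase = np.unwrap(np.angle(z))        # instantaneous phase in radians
    rate = np.diff(phase) * frame_rate / (2.0 * np.pi)
    return extent, rate


# Synthetic contour (hypothetical values): a 220 Hz note with 5.5 Hz vibrato
# and a 6 Hz extent, sampled at 200 pitch frames per second for 2 seconds.
frame_rate = 200.0
t = np.arange(400) / frame_rate
f0 = 220.0 + 6.0 * np.sin(2.0 * np.pi * 5.5 * t)

extent, rate = vibrato_parameters(f0, frame_rate)

# Average over the middle of the syllable to stay clear of edge effects.
core = slice(50, -50)
mean_extent = float(np.mean(extent[core]))
mean_rate = float(np.mean(rate[core]))
print(f"extent = {mean_extent:.2f} Hz, rate = {mean_rate:.2f} Hz")
```

For this synthetic contour the estimates should come out close to the 6 Hz extent and 5.5 Hz rate used to generate it. On real sung syllables the contour would typically need smoothing and a slowly varying estimate of the intended pitch before this step, and the resulting per-syllable tracks would be the kind of targets an ANN could be trained to predict.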