In this paper, we propose a novel representation of F0 contours that enables a computationally efficient algorithm for automatically estimating the parameters of an F0 control model for singing voices. The best-known F0 control model, a second-order system driven by a piecewise constant input, can generate F0 contours of natural singing voices, but it provides no means of automatically learning its parameters from observed F0 contours. We therefore model the piecewise constant input with hidden Markov models (HMMs) and approximate the second-order differential equation by a difference equation, and we estimate the model parameters optimally by iterating Viterbi training and an LPC-like solver. Our representation is a generative model that can identify both the target musical note sequence and the dynamics of singing behavior contained in the F0 contours. Our experimental results show that the proposed method can separate the dynamics from the target musical note sequence and generate F0 contours from the estimated model parameters.
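To make the difference-equation step concrete, the following is a minimal sketch, not the authors' implementation. It assumes a standard second-order system y'' + 2ζωy' + ω²y = ω²u, a backward-difference discretization, and a target note sequence u that is already known (in the method described above, that sequence would instead come from the Viterbi step). The function names, the parameter values, and the use of ordinary least squares as a stand-in for the "LPC-like solver" are all illustrative assumptions.

```python
import numpy as np

def second_order_coeffs(omega, zeta, dt):
    """Backward-difference discretization (an assumed scheme) of
    y'' + 2*zeta*omega*y' + omega^2*y = omega^2*u.
    Returns (a1, a2, b) for y[n] = a1*y[n-1] + a2*y[n-2] + b*u[n]."""
    d = 1.0 / dt**2 + 2.0 * zeta * omega / dt + omega**2
    a1 = (2.0 / dt**2 + 2.0 * zeta * omega / dt) / d
    a2 = -(1.0 / dt**2) / d
    b = omega**2 / d
    return a1, a2, b

def simulate(u, a1, a2, b):
    """Run the difference equation on a piecewise constant input u."""
    y = np.empty_like(u, dtype=float)
    y[0] = y[1] = u[0]  # start at rest on the first note
    for n in range(2, len(u)):
        y[n] = a1 * y[n - 1] + a2 * y[n - 2] + b * u[n]
    return y

def estimate_coeffs(y, u):
    """Least-squares fit of (a1, a2, b) given the contour y and input u.
    This plays the role of an LPC-like solver in this sketch only."""
    X = np.column_stack([y[1:-1], y[:-2], u[2:]])
    theta, *_ = np.linalg.lstsq(X, y[2:], rcond=None)
    return theta

# Hypothetical note sequence in semitones, 50 frames per note at 10 ms per frame.
notes = np.repeat([60.0, 62.0, 64.0, 60.0], 50)
true = second_order_coeffs(omega=30.0, zeta=0.6, dt=0.01)
f0 = simulate(notes, *true)
est = estimate_coeffs(f0, notes)
print("true coefficients:     ", np.round(true, 5))
print("estimated coefficients:", np.round(est, 5))
```

On this noiseless synthetic contour the fit recovers the coefficients used for simulation. With real F0 data the note sequence is unknown, which is why the abstract describes alternating between decoding the input and re-estimating the dynamics.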