Paper Information - Fundamental Frequency Modulation in Singing Voice Synthesis

Fundamental Frequency Modulation in Singing Voice Synthesis

A model is presented for the analysis and synthesis of low-frequency, human-like pitch deviation, as a replacement for existing modulation techniques in singing voice synthesis systems. Fundamental frequency (f0) measurements are taken from vocalists producing a selected range of utterances without vibrato, and trends in the data are observed. A probabilistic function that provides natural-sounding, low-frequency f0 modulation to synthesized singing voices is presented, and its perceptual relevance is evaluated with subjective listening tests.

Cham Athwal | Ryan Stables | Jamie Bullock
