Fundamental Frequency Modulation in Singing Voice Synthesis

A model is presented for the analysis and synthesis of low-frequency, human-like pitch deviation, intended as a replacement for existing modulation techniques in singing voice synthesis systems. Fundamental frequency (f0) measurements are taken from vocalists producing a selected range of utterances without vibrato, and trends in the data are observed. A probabilistic function that adds natural-sounding low-frequency f0 modulation to synthesized singing voices is then presented, and its perceptual relevance is evaluated with subjective listening tests.
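
The abstract does not specify the probabilistic function, so the following is only a generic illustration of the idea of low-frequency stochastic f0 modulation, not the model presented in the paper. It assumes the drift can be approximated by Gaussian noise passed through a one-pole low-pass filter and expressed in cents. The function names (`drift_cents`, `apply_drift`) and all parameter values (frame rate, cutoff, depth) are hypothetical choices made for the example.

```python
import math
import random


def drift_cents(n_frames, frame_rate=100.0, cutoff_hz=2.0,
                depth_cents=10.0, seed=0):
    """Return a slow random pitch deviation in cents, one value per frame.

    Assumption: Gaussian noise smoothed by a one-pole low-pass filter,
    then rescaled so its RMS equals depth_cents. This is a stand-in,
    not the function described in the paper.
    """
    if n_frames <= 0:
        return []
    rng = random.Random(seed)
    # One-pole low-pass coefficient for the chosen cutoff frequency.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / frame_rate)
    y = 0.0
    out = []
    for _ in range(n_frames):
        y += alpha * (rng.gauss(0.0, 1.0) - y)
        out.append(y)
    rms = math.sqrt(sum(v * v for v in out) / len(out)) or 1.0
    return [depth_cents * v / rms for v in out]


def apply_drift(f0_hz, cents):
    """Shift each f0 value (Hz) by the matching deviation in cents."""
    return [f * 2.0 ** (c / 1200.0) for f, c in zip(f0_hz, cents)]


# Usage: modulate a flat 220 Hz target contour lasting 3 seconds.
target = [220.0] * 300
modulated = apply_drift(target, drift_cents(len(target)))
```

Working in cents keeps the deviation perceptually uniform across pitch, and the fixed seed makes the contour reproducible. A real system would fit the filter and depth to measured data such as the f0 recordings described above.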
