Transferring Vocal Expression of F0 Contour Using Singing Voice Synthesizer

A system is described for transferring individual vocal expressions from singing voices with accompaniment to singing voice synthesizers. The expressions appear as fluctuations in the fundamental frequency (F0) contour of the singing voice, such as vibrato, glissando, and kobushi. The F0 contour of the singing voice is estimated using subharmonic summation within a limited frequency range and is temporally aligned to a chromatic pitch sequence. Each expression is then transcribed and parameterized according to designed rules. Finally, the expressions are transferred to given scores on the singing voice synthesizer. Experiments demonstrated that the proposed system can transfer the vocal expressions while retaining the singer's individuality on two singing voice synthesizers: Vocaloid and CeVIO.
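As an illustration of the F0 estimation step, the following is a minimal sketch of subharmonic summation (Hermes, 1988) restricted to a limited search range. It is not the system's actual implementation: the function name `shs_f0`, the search range `fmin`/`fmax`, the number of harmonics, the decay factor of 0.84, and the 10-cent candidate grid are all assumptions chosen for illustration, and the spectrum is sampled by simple linear interpolation rather than on the log-frequency axis used in the original method.

```python
import numpy as np

def shs_f0(frame, sr, fmin=80.0, fmax=800.0, n_harmonics=8,
           decay=0.84, n_fft=8192, cents_step=10.0):
    """Estimate the F0 of one audio frame by subharmonic summation.

    Only candidates inside [fmin, fmax] are scored (the "limited
    frequency range"). All default values are illustrative assumptions.
    """
    # Magnitude spectrum of the Hann-windowed, zero-padded frame.
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed, n=n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)

    # Candidate F0 values on a log-spaced grid (cents_step apart).
    n_cand = int(np.floor(1200.0 * np.log2(fmax / fmin) / cents_step)) + 1
    candidates = fmin * 2.0 ** (np.arange(n_cand) * cents_step / 1200.0)

    # Salience of each candidate: decaying weighted sum of the spectral
    # magnitude at its harmonics h * f0.
    salience = np.zeros(n_cand)
    for h in range(1, n_harmonics + 1):
        harmonic_freqs = candidates * h
        below_nyquist = harmonic_freqs < sr / 2.0
        amps = np.interp(harmonic_freqs, freqs, spectrum)
        salience += (decay ** (h - 1)) * amps * below_nyquist

    return candidates[np.argmax(salience)]

if __name__ == "__main__":
    # Synthetic test tone: 220 Hz with six harmonics of amplitude 1/k.
    sr = 16000
    t = np.arange(2048) / sr
    tone = sum(np.sin(2 * np.pi * 220.0 * k * t) / k for k in range(1, 7))
    print(shs_f0(tone, sr))
```

Applying such an estimator frame by frame yields an F0 contour. Restricting `fmin` and `fmax` to a band around the expected melody is one plausible way to reduce interference from the accompaniment, though how the system itself sets the range is not stated here.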
