Analysis-by-Performance: Gesturally-Controlled Voice Synthesis as an Input for Modelling of Vibrato in Singing

In this paper we introduce Analysis-by-Performance, a new methodology for addressing signal modelling problems. The approach studies the gestural behaviour of a performer imitating a given sound effect with an appropriate digital musical instrument; insights drawn from these performing gestures then lead to a new sound production model. The Analysis-by-Performance technique is applied here to the study of vibrato in singing. For several years, the HANDSKETCH digital instrument has given performers the ability to imitate singing vibrato in a highly natural and expressive way. Results from a gestural analysis of HANDSKETCH practice are presented, and a new vibrato model based on glottal flow parameters is proposed.
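
As a point of reference only, and not the model proposed in the paper, the sketch below shows how vibrato is commonly rendered at the control level: a slow sinusoidal modulation applied to glottal-source parameters such as fundamental frequency (f0) and open quotient (Oq). All rates, depths, and parameter names are illustrative assumptions.

```python
import numpy as np

# Generic illustration (not the paper's model): vibrato as a slow sinusoidal
# modulation of glottal-source parameter trajectories. Values below are
# typical textbook figures, not measurements from the HANDSKETCH study.
fs = 100.0                          # control rate in Hz (parameter trajectories, not audio)
t = np.arange(0.0, 2.0, 1.0 / fs)   # two seconds of control data

f0_mean = 220.0                     # assumed mean fundamental frequency, Hz
f0_depth_cents = 50.0               # assumed vibrato depth, cents
vibrato_rate = 5.5                  # assumed vibrato rate, Hz
oq_mean, oq_depth = 0.6, 0.05       # assumed open-quotient mean and modulation depth

# f0 modulated in cents around its mean; Oq modulated linearly, in phase with f0.
f0 = f0_mean * 2.0 ** ((f0_depth_cents / 1200.0) * np.sin(2 * np.pi * vibrato_rate * t))
oq = oq_mean + oq_depth * np.sin(2 * np.pi * vibrato_rate * t)

print(f0[:5], oq[:5])
```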
