Rule-based voice quality variation with formant synthesis

We describe an approach to simulating different phonation types, named following John Laver’s terminology, with a hybrid (rule-based and unit-concatenating) formant synthesizer. The voice qualities were generated by following hints from the literature and applying the revised KLGLOTT88 model. In a listener perception experiment, we show that listeners distinguish the phonation types and that these evoke the emotional impressions predicted by the literature. The synthesis system, its source code, and audio samples can be downloaded at http://emoSyn.syntheticspeech.de/.
