Articulatory synthesis of words in six voice qualities using a modified two-mass model of the vocal folds

A modified two-mass model of the vocal folds is introduced and applied to the articulatory synthesis of words in six voice qualities. The modified two-mass model uses mass elements that are inclined, instead of parallel, with respect to the dorsoventral axis as a function of the degree of abduction. This makes it possible to produce the continuum of voice qualities from pressed through modal to breathy voice. Furthermore, the model is extended by a variable posterior chink to represent the space between the arytenoid cartilages, as in whispery phonation. Five words were each synthesized with different glottal settings to simulate modal voice, pressed voice, breathy voice, whispery voice, vocal fry, and falsetto. The stimuli were judged by a group of listeners in a forced-choice experiment with respect to the perceived voice qualities. Apart from whispery voice, which was more often judged as breathy than whispery, all voice types were identified as intended with probabilities between 50% (modal voice) and 94% (falsetto), which are well above the chance level of 16.7% (one in six).
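
The comparison against chance in a six-alternative forced-choice test can be illustrated with a minimal sketch. It computes the guessing probability and a one-sided exact binomial tail for a given identification rate. The trial count `n_trials` and the helper names `chance_level` and `binomial_tail` are hypothetical, since the abstract does not state the number of listeners or judgments per voice quality, and this is not presented as the authors' analysis.

```python
from math import comb

# The six voice qualities named in the abstract.
VOICE_QUALITIES = ["modal", "pressed", "breathy", "whispery", "vocal fry", "falsetto"]


def chance_level(n_alternatives: int) -> float:
    """Probability of a correct answer by pure guessing in a forced-choice test."""
    return 1.0 / n_alternatives


def binomial_tail(n_trials: int, n_correct: int, p: float) -> float:
    """Exact P(X >= n_correct) for X ~ Binomial(n_trials, p)."""
    return sum(
        comb(n_trials, k) * p**k * (1.0 - p) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


p_chance = chance_level(len(VOICE_QUALITIES))

# ASSUMPTION: 100 judgments per voice quality (not given in the abstract).
n_trials = 100
# 50% correct corresponds to the lowest reported rate (modal voice).
n_correct = 50

print(f"chance level: {p_chance:.1%}")
print(f"P(at least {n_correct}/{n_trials} correct by guessing): "
      f"{binomial_tail(n_trials, n_correct, p_chance):.2e}")
```

With six response categories the guessing probability is 1/6, so even the lowest reported identification rate of 50% lies far in the upper tail under these assumed trial counts.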
