Morphological and acoustic analysis of the vocal tract using a multi-speaker volumetric MRI dataset

The shape of the vocal tract was analyzed from morphological and acoustic perspectives for ten male speakers of Japanese. Volumetric magnetic resonance imaging (MRI) data were acquired while each speaker produced each of the five Japanese vowels. Cross-sectional vocal-tract area functions were computed from the MRI data, and the resulting 50 vocal-tract shapes were analyzed statistically to determine the principal deformation patterns. The vocal-tract shape for each vowel was then perturbed to examine the effect on the first and second formant frequencies. When the perturbation was applied by varying the coefficients of the first and second principal modes, a local region of the coefficient plane was found in which the formant change was small; that is, the formants in this region were acoustically insensitive to perturbation of the vocal-tract shape. When the vocal-tract shapes of the ten speakers were plotted on the same plane, they were found to lie in the vicinity of this acoustically insensitive region. Based on these numerical investigations, the paper considers how individual differences in vocal-tract shape can be resolved to generate phonetically relevant speech sounds.
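The statistical step described above, extracting principal deformation modes from a set of area functions and perturbing a shape on the plane of the first two mode coefficients, can be sketched as follows. This is only an illustrative sketch under stated assumptions: the data are synthetic stand-ins rather than the MRI measurements, the function names `principal_modes` and `reconstruct` are hypothetical, the analysis is run on raw area values (the abstract does not say whether any normalization or transformation was applied), and the formant computation for each perturbed shape is omitted.

```python
import numpy as np

def principal_modes(area_functions, n_modes=2):
    """PCA of area functions (rows = shapes, columns = sections along the tract)."""
    mean_shape = area_functions.mean(axis=0)
    centered = area_functions - mean_shape
    # SVD of the centered data: rows of vt are orthonormal deformation modes.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    modes = vt[:n_modes]
    coeffs = centered @ modes.T                 # mode coefficients per shape
    explained = s[:n_modes] ** 2 / np.sum(s ** 2)
    return mean_shape, modes, coeffs, explained

def reconstruct(mean_shape, modes, coeffs):
    """Rebuild area function(s) from mode coefficients."""
    return mean_shape + np.asarray(coeffs) @ modes

# Synthetic stand-in for 50 shapes (10 speakers x 5 vowels); NOT real data.
rng = np.random.default_rng(0)
n_shapes, n_sections = 50, 40
x = np.linspace(0.0, 1.0, n_sections)
base = 3.0 + np.sin(np.pi * x)
patterns = np.vstack([np.cos(np.pi * x), np.cos(2.0 * np.pi * x)])
weights = rng.normal(size=(n_shapes, 2)) * np.array([1.0, 0.5])
areas = base + weights @ patterns + 0.01 * rng.normal(size=(n_shapes, n_sections))

mean_shape, modes, coeffs, explained = principal_modes(areas)

# Perturb one shape on a 5 x 5 grid around its own (c1, c2) coefficients.
c0 = coeffs[0]
deltas = np.linspace(-0.5, 0.5, 5)              # assumed perturbation range
grid = np.array([[c0[0] + d1, c0[1] + d2] for d1 in deltas for d2 in deltas])
perturbed = reconstruct(mean_shape, modes, grid)  # one area function per grid point
```

In a fuller analysis, each row of `perturbed` would be passed to an acoustic model of the vocal tract to obtain the first and second formants, and the formant change would be mapped over the coefficient plane to locate regions of low sensitivity.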
