The geometry of the articulatory region that produces a speech sound

It is known that some speech sounds can be produced by more than one vocal tract shape. Here, we study the extent to which individual articulators (e.g. the tongue tip) are constrained by a given acoustic frame. We apply parametric and nonparametric methods for articulatory inversion and quantify both the error these methods incur and the dimensionality and multimodality of the inverse region in articulatory space that corresponds to a speech sound.
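
To make the notion of an inverse region concrete, the following is a minimal sketch on synthetic data, not the method or data used in this work. It assumes a toy forward map in which a one-dimensional "acoustic" value equals the square of the first of two "articulatory" coordinates, so each sound has two disjoint sets of preimages. The function names `inverse_region` and `intrinsic_dimension`, the search radius, and the 95% variance threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def inverse_region(acoustic, articulatory, query, radius):
    """Nonparametric inversion: return every articulatory vector whose
    paired acoustic frame lies within `radius` of the query frame."""
    dist = np.linalg.norm(acoustic - query, axis=1)
    return articulatory[dist <= radius]

def intrinsic_dimension(points, variance_kept=0.95):
    """Count the principal components needed to explain `variance_kept`
    of the total variance (a crude linear estimate of dimensionality)."""
    centered = points - points.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(ratio, variance_kept) + 1)

# Toy data (assumption): articulatory vectors (a, b) drawn uniformly,
# with acoustic value a**2, so a and -a produce the same sound and b
# has no acoustic effect at all.
rng = np.random.default_rng(0)
articulatory = rng.uniform(-1.0, 1.0, size=(20000, 2))
acoustic = articulatory[:, :1] ** 2

region = inverse_region(acoustic, articulatory, query=np.array([0.49]), radius=0.02)

# Multimodality: the region splits into a branch near a = +0.7 and one near a = -0.7.
positive_branch = region[region[:, 0] > 0]
negative_branch = region[region[:, 0] < 0]

# Dimensionality of one branch: only b varies freely, so the estimate is low.
branch_dim = intrinsic_dimension(positive_branch)
```

In this toy setting the first coordinate is tightly constrained within each branch while the second is not constrained at all, which mirrors the question of how strongly an acoustic frame constrains each individual articulator.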

[1]  Miguel Á. Carreira-Perpiñán,et al.  Mode-Finding for Mixtures of Gaussian Distributions , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[2]  Sam T. Roweis,et al.  Data-driven production models for speech processing , 1999 .

[3]  Man Mohan Sondhi,et al.  Techniques for estimating vocal-tract shapes from the speech signal , 1994, IEEE Trans. Speech Audio Process..

[4]  Pascal Perrier,et al.  The geometric vocal tract variables controlled for vowel production: proposals for constraining acoustic-to-articulatory inversion , 1992 .

[5]  Raymond D. Kent,et al.  X‐ray microbeam speech production database , 1990 .

[6]  V. Gracco,et al.  Accurate recovery of articulator positions from acoustics: new conclusions based on human data. , 1996, The Journal of the Acoustical Society of America.

[7]  Miguel Á. Carreira-Perpiñán,et al.  A comparison of acoustic features for articulatory inversion , 2007, INTERSPEECH.

[8]  Korin Richmond Estimating velum height from acoustics during continuous speech , 1999, EUROSPEECH.

[9]  Korin Richmond,et al.  Estimating articulatory parameters from the acoustic speech signal , 2002 .

[10]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[11]  Miguel Á. Carreira-Perpiñán,et al.  An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping , 2007, INTERSPEECH.

[12]  Miguel Á. Carreira-Perpiñán,et al.  Reconstruction of Sequential Data with Probabilistic Models and Continuity Constraints , 1999, NIPS.

[13]  G Papcun,et al.  Inferring articulation and recognizing gestures from acoustics with a neural network trained on x-ray microbeam data. , 1992, The Journal of the Acoustical Society of America.

[14]  Jonas Beskow,et al.  Recent Developments In Facial Animation: An Inside View , 1998, AVSP.

[15]  Hynek Hermansky,et al.  The effective second formant F2' and the vocal tract front-cavity , 1989, International Conference on Acoustics, Speech, and Signal Processing,.

[16]  B. Atal,et al.  Inversion of articulatory-to-acoustic transformation in the vocal tract by a computer-sorting technique. , 1978, The Journal of the Acoustical Society of America.