Estimating the position of mistracked coil of EMA data using GMM-based methods

Kinematic articulatory data are important for research on speech production, articulatory speech synthesis, robust speech recognition, and speech inversion. The Electromagnetic Articulograph (EMA) is a widely used instrument for collecting kinematic articulatory data. However, in an EMA experiment, one or more of the coils attached to the articulators may be mistracked for various reasons. To make full use of the EMA data, we attempt to reconstruct the locations of mistracked coils using methods based on the Gaussian Mixture Model (GMM). These methods approximate the probability density function of the position of the coil of interest given the positions of the other coils, and then derive regression functions using the Minimum Mean Square Error (MMSE) and Maximum Likelihood (ML) criteria. The results indicate that: (i) the positions of mistracked coils can be reconstructed from the positions of correctly tracked coils with an RMSE between 1 mm and 1.5 mm; (ii) in most cases, the performance can be further improved by incorporating velocity information.
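
The MMSE regression step can be illustrated with a minimal sketch. It assumes a joint GMM has already been fitted over the stacked vector z = [x, y], where x holds the positions of the correctly tracked coils and y the position of the mistracked coil. The estimate is the conditional mean E[y | x], that is, a responsibility-weighted sum of the per-component linear regressions. The function name `gmm_mmse_estimate`, its arguments, and the use of NumPy are assumptions made for illustration; the abstract does not specify the implementation, the number of mixture components, or the feature layout, and the ML variant with velocity information is not shown.

```python
import numpy as np

def gmm_mmse_estimate(x, weights, means, covs, dx):
    """Conditional-mean (MMSE) estimate of y given x under a joint GMM.

    Hypothetical helper: the joint vector is z = [x, y], with x occupying
    the first dx dimensions of every component mean and covariance.
    """
    x = np.asarray(x, dtype=float)
    log_resp = []    # unnormalised log posterior of each component given x
    cond_means = []  # per-component conditional mean E[y | x, component]
    for w, mu, cov in zip(weights, means, covs):
        mu = np.asarray(mu, dtype=float)
        cov = np.asarray(cov, dtype=float)
        mu_x, mu_y = mu[:dx], mu[dx:]
        cov_xx, cov_yx = cov[:dx, :dx], cov[dx:, :dx]
        diff = x - mu_x
        sol = np.linalg.solve(cov_xx, diff)  # cov_xx^{-1} (x - mu_x)
        _, logdet = np.linalg.slogdet(cov_xx)
        # log of w * N(x; mu_x, cov_xx)
        log_resp.append(
            np.log(w) - 0.5 * (diff @ sol + logdet + dx * np.log(2.0 * np.pi))
        )
        # mu_y + cov_yx cov_xx^{-1} (x - mu_x)
        cond_means.append(mu_y + cov_yx @ sol)
    log_resp = np.array(log_resp)
    resp = np.exp(log_resp - log_resp.max())  # subtract max for stability
    resp /= resp.sum()
    return resp @ np.array(cond_means)
```

In this sketch, `weights`, `means`, and `covs` would come from any GMM trainer run on frames where all coils were tracked correctly; the estimate is then evaluated frame by frame on the recordings in which one coil was lost.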
