Maximum a-posteriori estimation of missing samples with continuity constraint in Electromagnetic Articulography data

Electromagnetic articulography (EMA) is used to record the kinematics of different articulators while a person speaks. EMA data often contain missing segments due to sensor failure. In this work, we propose maximum a-posteriori (MAP) estimation with a continuity constraint to recover the missing samples in articulatory trajectories recorded using EMA. This approach combines the benefits of statistical MAP estimation with the temporal continuity of the articulatory trajectories. Experiments on an articulatory corpus using different missing-segment durations show that the proposed continuity constraint yields a 30% reduction in average root mean squared error compared with statistical estimation of the missing segments without any continuity constraint.
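
The idea of combining a statistical estimate with temporal continuity can be illustrated with a minimal sketch. The abstract does not specify the statistical model or the exact form of the constraint, so everything below is an assumption made for illustration: a single joint Gaussian over sensor channels, a squared first-difference penalty as the continuity term, and the hypothetical names `conditional_gaussian`, `map_with_continuity`, and `lam`. It is not the authors' formulation.

```python
import numpy as np

def conditional_gaussian(mean, cov, obs_idx, mis_idx, x_obs):
    """Per-frame Gaussian conditional of missing channels given observed ones.

    x_obs has shape (T, n_obs). Returns the conditional mean, shape
    (T, n_mis), and the conditional covariance, shape (n_mis, n_mis).
    Assumes a single joint Gaussian (an illustrative choice).
    """
    mu_o, mu_m = mean[obs_idx], mean[mis_idx]
    S_oo = cov[np.ix_(obs_idx, obs_idx)]
    S_mo = cov[np.ix_(mis_idx, obs_idx)]
    S_mm = cov[np.ix_(mis_idx, mis_idx)]
    gain = S_mo @ np.linalg.inv(S_oo)
    cond_mean = mu_m + (x_obs - mu_o) @ gain.T
    cond_cov = S_mm - gain @ S_mo.T
    return cond_mean, cond_cov

def map_with_continuity(cond_mean, cond_var, left, right, lam):
    """Estimate one missing 1-D trajectory segment of length T.

    Minimizes  sum_t (x_t - cond_mean_t)^2 / cond_var
             + lam * sum of squared first differences,
    where the differences include the known samples `left` and `right`
    that border the missing segment. lam = 0 gives the frame-wise
    statistical estimate with no continuity constraint.
    """
    T = len(cond_mean)
    # Second-difference matrix arising from the first-difference penalty.
    lap = 2 * np.eye(T) - np.eye(T, k=1) - np.eye(T, k=-1)
    A = np.eye(T) / cond_var + lam * lap
    b = np.asarray(cond_mean, dtype=float) / cond_var
    b[0] += lam * left    # coupling to the last observed sample before the gap
    b[-1] += lam * right  # coupling to the first observed sample after the gap
    return np.linalg.solve(A, b)
```

In this sketch, `lam` trades off the statistical term against smoothness: with `lam = 0` the result is the frame-wise conditional mean, and as `lam` grows the estimate approaches a straight line between the two bordering samples.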
