Identifying articulatory goals from kinematic data using principal differential analysis

Articulatory goals can be highly indicative of lexical intentions, but are rarely used in speech classification tasks. In this paper we show that principal differential analysis can be used to learn the behaviours of articulatory motions associated with certain high-level articulatory goals. This method accurately learns the parameters of second-order differential systems from data obtained by electromagnetic articulography. On average, this approach is between 4.4% and 21.3% more accurate than HMM and neural network baselines.
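
To make the idea of learning a second-order differential system from kinematic data concrete, the following is a minimal sketch, not the method used in the paper. It assumes a single trajectory governed by x'' + b x' + k x = 0 with constant coefficients, whereas principal differential analysis in general estimates coefficient functions from many curves. The function names `simulate_damped` and `fit_second_order`, the synthetic signal, and the use of finite differences for the derivatives are all illustrative assumptions.

```python
import numpy as np

def simulate_damped(b, k, x0=1.0, v0=0.0, dt=0.001, n=3000):
    """Closed-form underdamped solution of x'' + b*x' + k*x = 0 (needs b**2 < 4*k).

    Stands in for one articulator trajectory; real data would come from
    electromagnetic articulography (hypothetical substitute here).
    """
    t = np.arange(n) * dt
    w = np.sqrt(k - b**2 / 4.0)  # damped angular frequency
    x = np.exp(-b * t / 2.0) * (
        x0 * np.cos(w * t) + (v0 + b * x0 / 2.0) / w * np.sin(w * t)
    )
    return t, x

def fit_second_order(x, dt):
    """Least-squares estimate of constant b, k in x'' + b*x' + k*x = 0."""
    dx = np.gradient(x, dt)    # first derivative by finite differences
    d2x = np.gradient(dx, dt)  # second derivative by finite differences
    inner = slice(2, -2)       # drop edge samples, where the differences are one-sided
    # Rearranged model: b*x' + k*x = -x'', solved for (b, k).
    A = np.column_stack([dx[inner], x[inner]])
    coef, *_ = np.linalg.lstsq(A, -d2x[inner], rcond=None)
    return coef[0], coef[1]

if __name__ == "__main__":
    t, x = simulate_damped(b=2.0, k=100.0)
    b_hat, k_hat = fit_second_order(x, dt=0.001)
    print(f"estimated damping b = {b_hat:.3f}, stiffness k = {k_hat:.3f}")
```

On noisy recordings, finite differences amplify measurement noise, so the trajectories would normally be smoothed (for example with a spline basis) before the derivatives are taken.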
