Estimation of articulatory gesture patterns from speech acoustics

We investigated dynamic programming (DP) and state-model (SM) approaches for estimating gestural scores from speech acoustics. We performed a word-identification task using the gestural pattern vector sequences estimated by each approach. For a set of 75 randomly chosen words, the DP approach gave the best word-identification accuracy (66.67%). This result suggests that recovering gestural information from acoustics in this way could provide considerable support for lexical access during speech perception.

Index Terms: gestural patterns, acoustic-to-gesture inversion
