The TORGO database of acoustic and articulatory speech from speakers with dysarthria

This paper describes the acquisition of a new database of dysarthric speech consisting of aligned acoustic and articulatory data. The database currently includes data from seven individuals with speech impediments caused by cerebral palsy or amyotrophic lateral sclerosis, together with age- and gender-matched control subjects. Each individual with a speech impediment is given standardized assessments of speech-motor function by a speech-language pathologist. Acoustic data is obtained with one head-mounted and one directional microphone. Articulatory data is obtained by electromagnetic articulography, which allows the measurement of the tongue and other articulators during speech, and by 3D reconstruction from binocular video sequences. The stimuli are drawn from a variety of sources, including the TIMIT database, lists of identified phonetic contrasts, and assessments of speech intelligibility. This paper also includes some analysis of how dysarthric speech differs from non-dysarthric speech according to features such as phoneme length and pronunciation errors.
