The 2019 Workshop on Speech and Language Processing for Assistive Technologies

Participating in conversations can be difficult for people with hearing loss, especially in acoustically challenging environments. We studied the preferences that hearing-impaired people have for a personal conversation assistant based on automatic speech recognition (ASR) technology. We created two prototypes, which were evaluated by hearing-impaired test users. This paper qualitatively compares the two prototypes based on the feedback obtained from the tests. The first prototype was a proof-of-concept system running real-time ASR on a laptop. The second prototype was developed for a mobile device, with the recognizer running on a separate server. On the mobile device, augmented reality (AR) was used to help hearing-impaired users observe the speaker's gestures and lip movements simultaneously with the transcriptions. Several testers found the systems useful enough to use in their daily lives, with the majority preferring the mobile AR version. The testers' biggest concerns were the accuracy of the transcriptions and the lack of speaker identification.
