Paper information - Probabilistic Dialogue Modeling for Speech-Enabled Assistive Technology

People with motor disabilities often face substantial challenges using interfaces designed for manual interaction. Although automatic speech recognition might partially alleviate such obstacles, these individuals may also have co-occurring speech-language challenges that result in high recognition error rates. In this paper, we investigate how augmenting speech applications with dialogue interaction can improve system performance among such users. We construct an end-to-end spoken dialogue system that lets our target users, adult wheelchair users with multiple sclerosis and other progressive neurological conditions in a specialized-care residence, access information and communication services through speech. We use boosting to discriminatively learn meaningful confidence scores and ask confirmation questions within a partially observable Markov decision process (POMDP) framework. Among our target users, the POMDP dialogue manager significantly increased the number of successfully completed dialogues (out of 20 dialogue tasks) compared to a baseline threshold-based strategy (p = 0.02). The reduction in dialogue completion times was more pronounced among speakers with higher error rates, illustrating the benefits of probabilistic dialogue modeling for our target population.

Index Terms: spoken dialogue systems, speech interfaces, POMDPs
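The contrast the abstract draws between a threshold-based baseline and a belief-tracking dialogue manager can be illustrated with a small sketch. It is not the paper's system: the intent names, the noise model (recognition errors spread uniformly over the other intents), and the hand-set decision thresholds are all assumptions made for illustration, and a real POMDP policy would be obtained by optimizing expected reward rather than by fixed cutoffs. The sketch only shows why accumulating evidence across turns can help when individual confidence scores are low.

```python
from typing import Dict

# Hypothetical user intents; the paper's actual task set is not given here.
INTENTS = ["weather", "menu", "phone"]


def update_belief(belief: Dict[str, float], hypothesis: str,
                  confidence: float) -> Dict[str, float]:
    """Bayes update of the belief over intents after one recognized utterance.

    Assumption: `confidence` approximates P(hypothesis is correct), and
    recognition errors are spread uniformly over the remaining intents.
    """
    eps = 1e-6
    confidence = min(max(confidence, eps), 1.0 - eps)  # avoid zero likelihoods
    n = len(belief)
    posterior = {}
    for intent, prior in belief.items():
        if intent == hypothesis:
            likelihood = confidence
        else:
            likelihood = (1.0 - confidence) / (n - 1)
        posterior[intent] = prior * likelihood
    total = sum(posterior.values())
    return {intent: p / total for intent, p in posterior.items()}


def threshold_action(hypothesis: str, confidence: float,
                     threshold: float = 0.7) -> str:
    """Baseline: judge each utterance in isolation against a fixed cutoff."""
    if confidence >= threshold:
        return f"execute:{hypothesis}"
    return "ask_repeat"


def belief_action(belief: Dict[str, float], execute_at: float = 0.85,
                  confirm_at: float = 0.5) -> str:
    """Hand-set stand-in for a POMDP policy: act on the accumulated belief."""
    best = max(belief, key=belief.get)
    if belief[best] >= execute_at:
        return f"execute:{best}"
    if belief[best] >= confirm_at:
        return f"confirm:{best}"
    return "ask_repeat"


if __name__ == "__main__":
    belief = {intent: 1.0 / len(INTENTS) for intent in INTENTS}
    # Three consecutive low-confidence recognitions of the same intent.
    for turn in range(1, 4):
        belief = update_belief(belief, "weather", 0.6)
        print(turn, threshold_action("weather", 0.6), belief_action(belief))
```

Under these assumptions the baseline asks the user to repeat on every turn, because no single confidence score clears its cutoff, while the belief-based manager first asks a confirmation question and then acts once the repeated evidence has raised its belief far enough.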
