A Quality-Focused Spoken Dialog System With Reinforcement Learning And Simulated User

In this paper, we propose a solution to the problem of formulating strategies for a spoken dialog system. Our approach is based on reinforcement learning (RL) with the help of a simulated user (SU): the system learns without supervision, by trial and error, receiving a return value (negative or positive) for each decision, in order to identify an optimal dialog strategy. We use the Markov decision process (MDP) as a framework for representing spoken dialog, in which the states represent history and discourse context, the actions are dialog acts, and the transition strategies are decisions on which action to take between states. We present our reinforcement learning approach with a novel objective function that is based on dialog quality as well as other quantitative factors.

Keywords: Learning control systems; Unsupervised learning; Markov processes; Artificial intelligence; Intelligent systems
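
To make the setting concrete, the following is a minimal sketch of learning a dialog strategy by trial and error against a simulated user, using tabular Q-learning on a small MDP. It is an illustration under stated assumptions, not the method of this paper: the two-slot task, the action names `ask_slot` and `confirm_and_close`, the 80% answer rate of the simulated user, and all reward values are hypothetical, and the reward here is a simple stand-in (task success minus a per-turn cost) rather than the quality-based objective function proposed in the paper.

```python
import random

# Hypothetical toy task: the state is the number of slots filled so far (0..N_SLOTS).
N_SLOTS = 2
# Hypothetical dialog acts available to the system.
ACTIONS = ["ask_slot", "confirm_and_close"]


def simulated_user(state, action, rng):
    """Toy simulated user. Returns (next_state, reward, done).

    All probabilities and reward values are illustrative assumptions.
    """
    if action == "ask_slot":
        # Every system turn costs -1; the user fills a missing slot 80% of the time.
        if state < N_SLOTS and rng.random() < 0.8:
            return state + 1, -1.0, False
        return state, -1.0, False
    # Closing the dialog ends the episode; it succeeds only if all slots are filled.
    if state == N_SLOTS:
        return state, 20.0, True
    return state, -10.0, True


def train(episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_SLOTS + 1) for a in ACTIONS}
    for _ in range(episodes):
        state, done, turns = 0, False, 0
        while not done and turns < 20:  # cap the dialog length at 20 turns
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state, reward, done = simulated_user(state, action, rng)
            # Terminal transitions have no future value.
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            turns += 1
    return q


def greedy_policy(q):
    """Extract the learned strategy: the best dialog act in each state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_SLOTS + 1)}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this toy setting the learned strategy is to keep asking until both slots are filled and only then close the dialog, because closing early is penalized while each extra turn carries a small cost. A richer reward that mixes dialog quality with quantitative factors would replace the values returned by `simulated_user`.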