Learning Observation Models for Dialogue POMDPs

The SmartWheeler project aims to develop an intelligent wheelchair for handicapped people. In this paper, we model the dialogue manager of SmartWheeler in the MDP and POMDP frameworks using its collected dialogues. First, we learn the model components of the dialogue MDP, building on our previous work. Then, we extend the dialogue MDP to a dialogue POMDP by proposing two observation models learned from the dialogues: one based on learned keywords and the other based on learned intentions. The resulting keyword POMDP and intention POMDP are compared on the mean accumulated reward obtained in simulation runs. Our experimental results show that the quality of the intention model is significantly higher than that of the keyword model.