Building Dialogue POMDPs from Expert Dialogues: An end-to-end approach

This book discusses the Partially Observable Markov Decision Process (POMDP) framework as applied to dialogue systems. It presents the POMDP as a formal framework that represents uncertainty explicitly while supporting automated policy solving. The authors propose and implement an end-to-end approach for learning the components of a dialogue POMDP model. Starting from scratch, they learn the states, the transition model, the observation model, and finally the reward model from unannotated and noisy dialogues. Together, these form a significant set of contributions that may inspire substantial further work. This concise manuscript is written in simple language and is full of illustrative examples, figures, and tables.