This book discusses the Partially Observable Markov Decision Process (POMDP) framework as applied to dialogue systems. It presents the POMDP as a formal framework that represents uncertainty explicitly while supporting automated policy solving. The authors propose and implement an end-to-end approach for learning the components of a dialogue POMDP model. Starting from scratch, they show how the state, the transition model, the observation model, and finally the reward model are learned from unannotated and noisy dialogues. Together, these form a significant set of contributions that can potentially inspire substantial further work. This concise manuscript is written in simple language and is full of illustrative examples, figures, and tables.
Provides insights on building dialogue systems to be applied in real domains
Illustrates learning dialogue POMDP model components from unannotated dialogues in a concise format
Introduces an end-to-end approach that makes use of unannotated and noisy dialogue for learning each component of dialogue POMDPs