Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Learning Multi-Goal Dialogue Strategies Using Reinforcement Learning with Reduced State-Action Spaces

Heriberto Cuayáhuitl, Steve Renals, Oliver Lemon, Hiroshi Shimodaira

University of Edinburgh, UK

Learning dialogue strategies with reinforcement learning is problematic because of its high computational cost. In this paper we propose an algorithm that reduces a state-action space to one that includes only valid state-actions. We performed experiments on full and reduced spaces using three systems (with 5, 9 and 20 slots) in the travel domain, using a simulated environment. The task was to learn multi-goal dialogue strategies optimizing single and multiple confirmations. Averaged results show that strategies learnt on reduced spaces have the following benefits over those learnt on full spaces: 1) less computer memory (94% reduction), 2) faster learning (93% faster convergence), and 3) better performance (8.4% fewer time steps and 7.7% higher reward).
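
The idea of restricting learning to valid state-actions can be illustrated with a minimal sketch. It is not the algorithm from the paper: the slot statuses, the two action types, and the validity rules in `is_valid` are hypothetical choices made only to show how filtering shrinks the table a learner has to store.

```python
from itertools import product

# Hypothetical slot statuses; the paper's actual state representation may differ.
SLOT_VALUES = ("empty", "filled", "confirmed")


def actions(n_slots):
    """All (kind, slot_index) actions: ask for a slot or confirm it."""
    return [(kind, i) for kind in ("ask", "confirm") for i in range(n_slots)]


def is_valid(state, action):
    """Assumed validity rules: ask only for empty slots, confirm only filled ones."""
    kind, i = action
    if kind == "ask":
        return state[i] == "empty"
    return state[i] == "filled"


def full_space(n_slots):
    """Every state paired with every action."""
    states = product(SLOT_VALUES, repeat=n_slots)
    return [(s, a) for s in states for a in actions(n_slots)]


def reduced_space(n_slots):
    """Only the state-action pairs that pass the validity check."""
    return [(s, a) for (s, a) in full_space(n_slots) if is_valid(s, a)]


if __name__ == "__main__":
    for n in (2, 5):
        print(n, len(full_space(n)), len(reduced_space(n)))
```

Under these toy rules the reduced space is one third the size of the full one. The reductions reported in the abstract come from the authors' own algorithm and systems, not from this sketch.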


Bibliographic reference. Cuayáhuitl, Heriberto / Renals, Steve / Lemon, Oliver / Shimodaira, Hiroshi (2006): "Learning multi-goal dialogue strategies using reinforcement learning with reduced state-action spaces", In INTERSPEECH-2006, paper 1282-Mon2FoP.6.