13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Interactive Spoken Content Retrieval with Different Types of Actions Optimized By a Markov Decision Process

Tsung-Hsien Wen (1), Hung-Yi Lee (2), Lin-Shan Lee (1,2)

(1) Graduate Institute of Electrical Engineering; (2) Graduate Institute of Communication Engineering;
National Taiwan University, Taipei, Taiwan

Interaction with the user is especially important for spoken content retrieval, not only because of recognition uncertainty, but also because retrieved spoken content items are difficult to show on the screen and difficult for the user to scan and select. The user cannot be expected to play back and go through all the retrieved items only to find out that they are not what he is looking for. In this paper, we propose a new approach for interactive spoken content retrieval, in which the system estimates the quality of the retrieved results and takes different types of actions to clarify the user's intention based on an intrinsic policy. The policy is optimized by a Markov Decision Process (MDP) trained with Reinforcement Learning based on a set of pre-defined rewards that account for the extra burden placed on the user.
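
As a rough illustration of the idea, the following minimal sketch trains a policy over a toy MDP in which the state is a discretized estimate of retrieval quality, interactive actions cost the user some effort, and showing the results ends the session. Everything concrete here is an assumption made for illustration and is not taken from the paper: the action names, the five quality levels, the reward values, the transition probabilities, and the use of tabular Q-learning as the Reinforcement Learning method.

```python
import random

# Hypothetical action set: one terminal action and two interactive ones.
ACTIONS = ["show_results", "ask_for_keyword", "request_more_info"]
N_LEVELS = 5  # assumed number of discretized retrieval-quality levels (0 = worst)

# Assumed per-action settings: (user burden, chance of improvement, quality gain).
INTERACTIVE = {
    "ask_for_keyword": (2.0, 0.8, 1),
    "request_more_info": (4.0, 0.6, 2),
}


def step(state, action, rng):
    """Return (next_state, reward); next_state is None when the session ends."""
    if action == "show_results":
        # Terminal reward grows with the estimated quality of the results.
        return None, 10.0 * state
    cost, prob, gain = INTERACTIVE[action]
    improved = rng.random() < prob
    next_state = min(N_LEVELS - 1, state + gain) if improved else state
    # Interactive actions are penalized by the burden they place on the user.
    return next_state, -cost


def train(episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_LEVELS) for a in ACTIONS}
    for _ in range(episodes):
        state = rng.randrange(N_LEVELS)
        for _ in range(20):  # cap the number of turns per session
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward = step(state, action, rng)
            if next_state is None:
                future = 0.0
            else:
                future = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            if next_state is None:
                break
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action for each quality level."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_LEVELS)}


policy = greedy_policy(train())
```

Under these assumed rewards, the learned policy shows the results once the estimated quality is at its highest level and prefers an interactive action when the quality is at its lowest, which mirrors the trade-off between result quality and user burden described above.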

Index Terms: Interactive SDR, MDP, Reinforcement Learning


Bibliographic reference. Wen, Tsung-Hsien / Lee, Hung-Yi / Lee, Lin-Shan (2012): "Interactive spoken content retrieval with different types of actions optimized by a Markov decision process", In INTERSPEECH-2012, 2458-2461.