13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Discriminative Training Using Non-uniform Criteria for Keyword Spotting on Spontaneous Speech

Chao Weng (1), Biing-Hwang (Fred) Juang (1), Daniel Povey (2)

(1) Center for Signal and Image Processing, Georgia Institute of Technology, Atlanta, GA, USA
(2) Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, USA

In this work, we investigate the feasibility of applying our prior work on discriminative training (DT) using non-uniform criteria to a keyword spotting task on spontaneous conversational speech. One DT method, minimum classification error (MCE), is recast and efficiently implemented in the weighted finite state transducer (WFST) framework to fit the keyword spotting task. To validate our approach, we evaluate it on a conversational speech task, the credit card use subset of Switchboard, in two keyword spotting scenarios: one in which a large vocabulary continuous speech recognition (LVCSR) decoder is available, and one in which a simple word-loop grammar with a limited vocabulary is used. The results show that our approach performs well in both cases, achieving absolute figure of merit (FOM) improvements of 2.77% and 3.15%.
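For readers unfamiliar with MCE, the following is a minimal sketch of the generic textbook form of the smoothed MCE loss. It is not the WFST-based implementation described in the paper. The function name `mce_loss`, the parameters `eta`, `gamma`, `theta`, and the scalar `error_cost` (a stand-in for a non-uniform error weight) are all assumptions made for illustration.

```python
import math


def _sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def mce_loss(scores, correct, eta=1.0, gamma=1.0, theta=0.0, error_cost=1.0):
    """Smoothed MCE loss for one utterance (illustrative sketch only).

    scores     : log-domain discriminant scores, one per hypothesis
    correct    : index of the reference hypothesis in `scores`
    eta        : smoothing over competitors (large eta approaches max)
    gamma/theta: slope and offset of the sigmoid
    error_cost : hypothetical non-uniform weight on this error
    """
    competitors = [s for i, s in enumerate(scores) if i != correct]
    if not competitors:
        raise ValueError("need at least one competing hypothesis")

    # (1/eta) * log of the mean of exp(eta * g_j), via log-sum-exp.
    m = max(eta * s for s in competitors)
    mean_exp = sum(math.exp(eta * s - m) for s in competitors) / len(competitors)
    anti_score = (m + math.log(mean_exp)) / eta

    # Misclassification measure: positive when competitors outscore the reference.
    d = -scores[correct] + anti_score

    # Sigmoid turns the measure into a smooth 0-1 error count, scaled by the cost.
    return error_cost * _sigmoid(gamma * d + theta)
```

A loss below 0.5 means the reference hypothesis outscores its competitors. Raising `error_cost` for selected errors (for example, those involving keywords) is one simple way to picture a non-uniform criterion, though the paper's actual weighting scheme is not specified in this abstract.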

Index Terms: LVCSR, keyword spotting, DT, non-uniform criteria, WFST


Bibliographic reference. Weng, Chao / Juang, Biing-Hwang (Fred) / Povey, Daniel (2012): "Discriminative training using non-uniform criteria for keyword spotting on spontaneous speech", in INTERSPEECH-2012, 559-562.