13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Improved Model Selection for the ASR-Driven Binary Mask

William Hartmann, Eric Fosler-Lussier

Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA

In a previous study, we proposed an alternative masking criterion for binary mask estimation based on the underlying linguistic information. We estimated this mask by selecting, at each frame, from a set of candidate masks according to the hypotheses of an ASR system. That system reduced word error rate (WER) by 8%. In this work, we present an improved method for selecting the correct candidate mask at each frame, increasing the WER reduction to 14%. The new method uses a discriminative sequence model and provides a framework that can incorporate other mask estimates as features.

Index Terms: speech recognition, binary mask estimation


Bibliographic reference. Hartmann, William / Fosler-Lussier, Eric (2012): "Improved model selection for the ASR-driven binary mask", in INTERSPEECH-2012, 1203-1206.