13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Decoding of Uncertain Features Using the Posterior Distribution of the Clean Data for Robust Speech Recognition

Ahmed Hussen Abdelaziz, Dorothea Kolossa

Digital Signal Processing Group, Institute of Communication Acoustics, Ruhr-Universität Bochum, Germany

Uncertainty-of-observation techniques have recently improved the performance of automatic speech recognition in non-stationary noisy environments. They treat the preprocessed feature vectors not as deterministic values but as random variables: noisy observations of the underlying, hidden, clean speech features. Several modifications to the speech decoding rule have been proposed in this framework. Two of them, uncertainty decoding and modified imputation, are especially straightforward to implement and have performed well in many environments. We present a new decoding rule that shares the simplicity of these two strategies but yields consistently better accuracy over a wide range of non-stationary noise conditions. In addition, we provide a unifying view of the three strategies using a simple Bayesian network, from which we deduce insights into the relationships among them.
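
For orientation, the two established rules named above can be sketched for a single diagonal-covariance Gaussian state model. This is a minimal illustration, not the paper's implementation and not the new rule it proposes. It assumes the commonly cited forms: uncertainty decoding scores the feature estimate under the state Gaussian with the feature uncertainty added to the model variance, and modified imputation first forms a precision-weighted combination of feature estimate and state mean, then scores that point under the unmodified state Gaussian. All function and variable names (`mu_x`, `var_x`, `mu_q`, `var_q`) are hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def uncertainty_decoding_loglik(mu_x, var_x, mu_q, var_q):
    """Assumed form: score the feature estimate mu_x under the state
    Gaussian whose variance is widened by the feature uncertainty var_x."""
    widened = [vq + vx for vq, vx in zip(var_q, var_x)]
    return log_gauss_diag(mu_x, mu_q, widened)

def modified_imputation_loglik(mu_x, var_x, mu_q, var_q):
    """Assumed form: replace the feature by a precision-weighted mix of
    the estimate mu_x and the state mean mu_q, then score that point
    under the unmodified state Gaussian."""
    x_hat = [
        (vq * mx + vx * mq) / (vq + vx)
        for mx, vx, mq, vq in zip(mu_x, var_x, mu_q, var_q)
    ]
    return log_gauss_diag(x_hat, mu_q, var_q)
```

Under these assumptions, both functions reduce to the conventional state likelihood when `var_x` is zero in every dimension, which is the sense in which they generalize standard decoding.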

Index Terms: Robust speech recognition, uncertainty decoding, modified imputation

Bibliographic reference. Abdelaziz, Ahmed Hussen / Kolossa, Dorothea (2012): "Decoding of uncertain features using the posterior distribution of the clean data for robust speech recognition", in INTERSPEECH-2012, 2634-2637.