14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Spectro-Temporal Post-Enhancement Using MMSE Estimation in NMF Based Single-Channel Source Separation

Emad M. Grais, Hakan Erdogan

Sabancı Üniversitesi, Turkey

We propose to use minimum mean squared error (MMSE) estimates to enhance signals that have been separated by nonnegative matrix factorization (NMF). In single-channel source separation (SCSS), NMF is first used to train a set of basis vectors for each source from that source's training spectrograms. NMF is then used to decompose the mixed-signal spectrogram as a weighted linear combination of the trained basis vectors, from which an estimate of each source can be obtained. In this work, we treat the spectrogram of each separated signal as a distorted 2D signal that needs to be restored. A multiplicative distortion model is assumed, in which the distribution of the logarithm of the true signal is modeled with a Gaussian mixture model (GMM) and the distortion is modeled as log-normally distributed. The parameters of the GMM are learned from training data, whereas the distortion parameters are learned online from each separated signal. The initial source estimates are then replaced with their MMSE estimates under this probabilistic framework. The experimental results show that applying the proposed MMSE estimation as a post-enhancement step after NMF improves the quality of the separated signals.
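As a rough illustration of the estimation step described in the abstract, the following sketch computes an MMSE estimate in the log domain, where the multiplicative distortion becomes additive: log y = log x + log d. It is a simplified scalar version written under stated assumptions, not the paper's actual formulation: the function name `mmse_log_estimate`, the per-value (scalar) GMM, and all parameter values are hypothetical, and the distortion parameters are simply passed in rather than learned online from the separated signal.

```python
import math


def log_gaussian(x, mean, var):
    """Log-density of a scalar Gaussian N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def mmse_log_estimate(y, weights, means, variances, dist_mean, dist_var):
    """MMSE estimate of log x given y = log x + log d (scalar sketch).

    Assumptions (illustrative only):
      log x ~ GMM(weights, means, variances)   prior on the clean log-value
      log d ~ N(dist_mean, dist_var)           log-normal distortion
    """
    # Under component k, y is Gaussian with the means and variances summed.
    log_joint = [
        math.log(w) + log_gaussian(y, m + dist_mean, v + dist_var)
        for w, m, v in zip(weights, means, variances)
    ]
    # Normalize in a numerically stable way to get component posteriors.
    peak = max(log_joint)
    unnorm = [math.exp(lj - peak) for lj in log_joint]
    total = sum(unnorm)
    posteriors = [u / total for u in unnorm]
    # Per-component conditional mean: prior mean plus a Wiener-like gain
    # applied to the part of y the component does not explain.
    cond_means = [
        m + v / (v + dist_var) * (y - dist_mean - m)
        for m, v in zip(means, variances)
    ]
    # The MMSE estimate is the posterior-weighted average of those means.
    return sum(p * c for p, c in zip(posteriors, cond_means))


# Hypothetical usage: enhance one log-magnitude value of a separated spectrogram.
separated_log_value = math.log(0.8)
enhanced_log_value = mmse_log_estimate(
    separated_log_value,
    weights=[0.6, 0.4],
    means=[-1.0, 0.5],
    variances=[0.5, 0.3],
    dist_mean=0.0,
    dist_var=0.2,
)
enhanced_value = math.exp(enhanced_log_value)
```

Note that exponentiating the result gives an estimate that is MMSE-optimal for the logarithm of the signal, which is not in general the same as the MMSE estimate of the signal itself. A full implementation would also operate on spectrogram vectors or patches rather than on isolated scalar values.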


Bibliographic reference. Grais, Emad M. / Erdogan, Hakan (2013): "Spectro-temporal post-enhancement using MMSE estimation in NMF based single-channel source separation", in INTERSPEECH-2013, 3279-3283.