13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Speech Enhancement by Online Non-negative Spectrogram Decomposition in Nonstationary Noise Environments

Zhiyao Duan (1), Gautham J. Mysore (2), Paris Smaragdis (2,3)

(1) Department of EECS, Northwestern University, Evanston, IL, USA
(2) Advanced Technology Labs, Adobe Systems Inc., San Francisco, CA, USA
(3) Department of Computer Science, UIUC, Urbana, IL, USA

Classical single-channel speech enhancement algorithms have two convenient properties: they require pre-learning the noise model but not the speech model, and they work online. However, they often have difficulties in dealing with non-stationary noise sources. Source separation algorithms based on non-negative spectrogram decompositions are capable of dealing with non-stationary noise, but do not possess the aforementioned properties. In this paper we present a novel algorithm that combines the advantages of both classical algorithms and non-negative spectrogram decomposition algorithms. Experiments show that it significantly outperforms four categories of classical algorithms in non-stationary noise environments.
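To make the idea of a non-negative spectrogram decomposition with a pre-learned noise model concrete, the following is a minimal illustrative sketch and not the algorithm of the paper. It assumes a magnitude spectrogram `V` (frequency bins by frames) and a noise dictionary `W_noise` learned beforehand from noise-only audio; it keeps `W_noise` fixed and estimates a speech dictionary and all activations with the standard multiplicative updates for the KL divergence. The function name `separate_speech`, its parameters, and the batch (rather than online, frame-by-frame) processing are assumptions made for illustration only.

```python
import numpy as np

def separate_speech(V, W_noise, n_speech=4, n_iter=200, eps=1e-12, seed=0):
    """Illustrative batch sketch (hypothetical helper, not the paper's method).

    Approximates V ~= [W_noise, W_speech] @ H with non-negative factors.
    W_noise is pre-learned and held fixed; W_speech and H are estimated.
    Returns the speech magnitude estimate, W_speech, and H.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    k_noise = W_noise.shape[1]

    # Random non-negative initialisation of the unknown factors.
    W_speech = rng.random((n_freq, n_speech)) + eps
    H = rng.random((k_noise + n_speech, n_frames)) + eps

    for _ in range(n_iter):
        W = np.hstack([W_noise, W_speech])

        # KL multiplicative update for all activations (noise and speech).
        R = V / (W @ H + eps)
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + eps)

        # KL multiplicative update for the speech dictionary only;
        # the noise dictionary is never modified.
        R = V / (W @ H + eps)
        H_speech = H[k_noise:]
        W_speech *= (R @ H_speech.T) / (H_speech.sum(axis=1)[None, :] + eps)

    # Soft mask: the speech share of the modelled spectrogram, applied to V.
    W = np.hstack([W_noise, W_speech])
    mask = (W_speech @ H[k_noise:]) / (W @ H + eps)
    return mask * V, W_speech, H
```

In this sketch the enhanced magnitude would be combined with the phase of the noisy signal to resynthesise audio. An online variant would process one frame (or a short buffer of frames) at a time instead of the whole spectrogram.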


Bibliographic reference.  Duan, Zhiyao / Mysore, Gautham J. / Smaragdis, Paris (2012): "Speech enhancement by online non-negative spectrogram decomposition in nonstationary noise environments", In INTERSPEECH-2012, 595-598.