13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Speech Enhancement Using Sparse Convolutive Non-negative Matrix Factorization with Basis Adaptation

Michael A. Carlin (1), Nicolas Malyska (2), Thomas F. Quatieri (2)

(1) Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, USA
(2) MIT Lincoln Laboratory, Lexington, MA, USA

We introduce a framework for speech enhancement based on convolutive non-negative matrix factorization that leverages available speech data to enhance arbitrary noisy utterances with no a priori knowledge of the speakers or noise types present. Previous approaches have shown the utility of a sparse reconstruction of the speech-only components of an observed noisy utterance. We demonstrate that overall enhancement quality improves when the underlying speech representation, in addition to being sparse, also adapts to the noisy acoustics. The proposed system performs comparably to a traditional Wiener filtering approach, and the results suggest that the proposed framework is most useful in moderate- to low-SNR scenarios.
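To make the term "sparse convolutive non-negative matrix factorization" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a squared-error cost with an L1 penalty on the activations, holds the bases fixed, and omits basis adaptation entirely; the abstract does not specify the cost function or update rules. The names `shift`, `reconstruct`, `objective`, `update_activations`, and the penalty weight `lam` are hypothetical, and the data are synthetic.

```python
import numpy as np

def shift(X, t):
    """Shift the columns of X by t frames (right if t > 0, left if t < 0), zero-filling."""
    out = np.zeros_like(X)
    n = X.shape[1]
    if abs(t) >= n:
        return out
    if t == 0:
        out[:] = X
    elif t > 0:
        out[:, t:] = X[:, :n - t]
    else:
        out[:, :n + t] = X[:, -t:]
    return out

def reconstruct(W, H):
    """Convolutive model: sum over lags t of W[t] @ (H shifted right by t).

    W has shape (T, F, K): T lags, F frequency bins, K bases.
    H has shape (K, N): K activations over N frames.
    """
    return sum(W[t] @ shift(H, t) for t in range(W.shape[0]))

def objective(V, W, H, lam):
    """Assumed cost: half squared error plus an L1 sparsity penalty on H."""
    return 0.5 * np.sum((V - reconstruct(W, H)) ** 2) + lam * np.sum(H)

def update_activations(V, W, H, lam, eps=1e-12):
    """One multiplicative update of H with the bases W held fixed."""
    approx = reconstruct(W, H)
    # Shifting left by t is the adjoint of shifting right by t.
    num = sum(shift(W[t].T @ V, -t) for t in range(W.shape[0]))
    den = sum(shift(W[t].T @ approx, -t) for t in range(W.shape[0])) + lam + eps
    return H * num / den

# Synthetic, non-negative example data (not speech).
rng = np.random.default_rng(0)
F, K, T, N = 20, 4, 3, 50
W = rng.random((T, F, K))
H_true = rng.random((K, N)) * (rng.random((K, N)) < 0.2)  # sparse activations
V = reconstruct(W, H_true)

H = rng.random((K, N))
cost_before = objective(V, W, H, lam=0.1)
for _ in range(100):
    H = update_activations(V, W, H, lam=0.1)
cost_after = objective(V, W, H, lam=0.1)
```

In this sketch the sparsity weight `lam` appears only in the denominator of the update, which pushes small activations toward zero while keeping every entry of `H` non-negative. An adaptive variant along the lines the abstract describes would also update `W` on the noisy input, which is not shown here.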

Index Terms: speech enhancement, convolutive non-negative matrix factorization, basis adaptation, sparsity

Bibliographic reference. Carlin, Michael A. / Malyska, Nicolas / Quatieri, Thomas F. (2012): "Speech enhancement using sparse convolutive non-negative matrix factorization with basis adaptation", in INTERSPEECH-2012, 583-586.