12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Probabilistic Spectrum Envelope: Categorized Audio-Features Representation for NMF-Based Sound Decomposition

Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki

Kobe University, Japan

NMF (Non-negative Matrix Factorization) has been one of the most useful techniques for audio signal analysis in recent years. In particular, supervised NMF, in which a large number of samples is used to analyze a signal, is attracting much attention in sound source separation and noise reduction research. However, because such methods require all possible samples for the analysis, it is hard to build a practical system based on them. In this paper, we propose a novel signal analysis method that combines NMF with a probabilistic approach. In this approach, each audio-source category (such as a phoneme or a musical instrument) is assumed to have an environment-invariant feature, called a probabilistic spectrum envelope (PSE). First, the PSE of each category is learned using a technique based on Gaussian Process Regression. Then, the observed spectrum is analyzed using a combination of supervised NMF and a Genetic Algorithm with the pre-trained PSEs.
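As a rough illustration of the supervised NMF step mentioned in the abstract, the sketch below holds a basis matrix fixed and estimates only the non-negative activations. It is not the authors' implementation: the abstract does not state the cost function or update rule, so the Euclidean-cost multiplicative update is an assumption, the function name `supervised_nmf_activations` is hypothetical, and the random matrix `W` merely stands in for pre-trained spectral bases. The PSE learning (Gaussian Process Regression) and the Genetic Algorithm search are not modeled here.

```python
import numpy as np

def supervised_nmf_activations(V, W, n_iter=200, eps=1e-12, seed=0):
    """Estimate non-negative activations H so that V ≈ W @ H, with W held fixed.

    Assumption: Euclidean cost with the standard multiplicative update;
    the abstract does not specify which cost the paper uses.
    """
    rng = np.random.default_rng(seed)
    # Random strictly positive initialization of the activations.
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative update keeps H non-negative and does not increase the cost.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

# Toy data: 64 frequency bins, 4 fixed bases, 10 time frames (all synthetic).
rng = np.random.default_rng(1)
W = rng.random((64, 4))        # stand-in for pre-trained spectral bases
H_true = rng.random((4, 10))   # ground-truth activations for the toy example
V = W @ H_true                 # observed non-negative "spectrogram"

H = supervised_nmf_activations(V, W)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.3e}")
```

Because only `H` is updated, every basis the signal might need must already be a column of `W`; this is the dependence on exhaustive samples that the abstract identifies as the practical limitation of supervised NMF.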


Bibliographic reference.  Nakashika, Toru / Takiguchi, Tetsuya / Ariki, Yasuo (2011): "Probabilistic spectrum envelope: categorized audio-features representation for NMF-based sound decomposition", In INTERSPEECH-2011, 1765-1768.