ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing

ICC Jeju, Korea
October 3, 2004

Separation of Sound Sources by Convolutive Sparse Coding

Tuomas Virtanen

Institute of Signal Processing, Tampere University of Technology, Finland

An algorithm for the separation of sound sources is presented. Each source is parametrized as a convolution between a time-frequency magnitude spectrogram and an onset vector. The source model is able to represent several types of sounds, for example repetitive drum sounds and harmonic sounds with modulations. An iterative algorithm is proposed for the estimation of the parameters. The algorithm is based on minimizing the reconstruction error and the number of onsets. The number of onsets is minimized by applying a sparse coding scheme to the onset vectors. A way of modeling the loudness perception of the human auditory system is also proposed. The method compresses high-energy sources and enables the separation of low-energy sources which are perceptually significant. The algorithm is able to separate meaningful sources from real-world signals. Simulation experiments were carried out using mixtures of harmonic instruments. Demonstration signals are available at http://www.cs.tut.fi/~tuomasv/demopage.html .
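
The source model described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the array layout, the squared-error measure, the L1 penalty on the onset vectors, and the `sparsity_weight` value are all assumptions chosen for illustration. The abstract only states that reconstruction error and the number of onsets are minimized. The loudness model and the iterative parameter updates are not shown.

```python
import numpy as np


def reconstruct(spectrograms, onsets):
    """Build the model spectrogram as a sum of per-source convolutions.

    spectrograms: array of shape (n_sources, n_freq, n_lags), one short
                  magnitude spectrogram per source (hypothetical layout).
    onsets:       array of shape (n_sources, n_frames), one onset vector
                  per source.
    Returns an array of shape (n_freq, n_frames).
    """
    n_sources, n_freq, _ = spectrograms.shape
    n_frames = onsets.shape[1]
    model = np.zeros((n_freq, n_frames))
    for n in range(n_sources):
        for f in range(n_freq):
            # Convolve the onset vector with this frequency row over time,
            # then truncate the full convolution to the observed length.
            model[f] += np.convolve(onsets[n], spectrograms[n, f])[:n_frames]
    return model


def cost(observed, spectrograms, onsets, sparsity_weight=0.1):
    """Reconstruction error plus a sparseness penalty on the onsets.

    Assumption: squared error and an L1 penalty stand in for the error
    measure and sparse coding term, which the abstract does not specify.
    """
    error = np.sum((observed - reconstruct(spectrograms, onsets)) ** 2)
    penalty = sparsity_weight * np.sum(np.abs(onsets))
    return error + penalty


# One source with a two-frame spectrogram that is triggered twice.
spectrograms = np.array([[[1.0, 0.5], [2.0, 1.0]]])
onsets = np.array([[1.0, 0.0, 0.0, 1.0, 0.0]])
observed = reconstruct(spectrograms, onsets)
total = cost(observed, spectrograms, onsets)
```

In this toy case the observed spectrogram is generated by the model itself, so the reconstruction error is zero and the cost consists only of the sparseness penalty on the two onsets.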


Bibliographic reference.  Virtanen, Tuomas (2004): "Separation of sound sources by convolutive sparse coding", In SAPA-2004, paper 55.