ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing
ICC Jeju, Korea
An algorithm for the separation of sound sources is presented. Each source is parametrized as a convolution between a time-frequency magnitude spectrogram and an onset vector. The source model is able to represent several types of sounds, for example repetitive drum sounds and harmonic sounds with modulations. An iterative algorithm is proposed for the estimation of the parameters. The algorithm is based on minimizing the reconstruction error and the number of onsets. The number of onsets is minimized by applying a sparse coding scheme to the onset vectors. A way of modeling the loudness perception of the human auditory system is proposed. The method compresses high-energy sources and enables the separation of low-energy sources which are perceptually significant. The algorithm is able to separate meaningful sources from real-world signals. Simulation experiments were carried out using mixtures of harmonic instruments. Demonstration signals are available at http://www.cs.tut.fi/~tuomasv/demopage.html .
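As a rough illustration of the source model described in the abstract, the sketch below builds one source by convolving a small magnitude spectrogram with an onset vector along the time axis, then evaluates a cost that combines reconstruction error with a sparsity term on the onsets. The abstract does not give the exact error measure or sparsity penalty, so the squared error, the L1 penalty, and the names `synthesize_source`, `cost`, and `sparsity_weight` are assumptions made here for illustration only. The loudness model and the iterative estimation procedure are not covered.

```python
import numpy as np

def synthesize_source(event_spec, onsets):
    """Convolve an event spectrogram (freq x event_len) with an onset
    vector (n_frames,) along time, truncated to n_frames columns."""
    n_freq, event_len = event_spec.shape
    n_frames = onsets.shape[0]
    out = np.zeros((n_freq, n_frames))
    for tau in range(min(event_len, n_frames)):
        # Column tau of the event appears tau frames after each onset.
        out[:, tau:] += np.outer(event_spec[:, tau], onsets[:n_frames - tau])
    return out

def cost(mixture, event_specs, onset_vectors, sparsity_weight=0.1):
    """Assumed cost: squared reconstruction error plus an L1 penalty
    on the onset vectors (a stand-in for the paper's sparseness term)."""
    model = sum(synthesize_source(a, s)
                for a, s in zip(event_specs, onset_vectors))
    reconstruction_error = np.sum((mixture - model) ** 2)
    sparsity_penalty = sum(np.sum(np.abs(s)) for s in onset_vectors)
    return reconstruction_error + sparsity_weight * sparsity_penalty

# Toy example: a two-frame event repeated at frames 0 and 3.
event_spec = np.array([[1.0, 0.5],
                       [2.0, 1.0]])
onsets = np.array([1.0, 0.0, 0.0, 1.0, 0.0])
source = synthesize_source(event_spec, onsets)
total = cost(source, [event_spec], [onsets])
print(source)
print(total)
```

In this toy case the mixture equals the modeled source, so the reconstruction error is zero and the cost consists only of the sparsity term for the two onsets.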
Bibliographic reference. Virtanen, Tuomas (2004): "Separation of sound sources by convolutive sparse coding", In SAPA-2004, paper 55.