On Discriminative Framework for Single Channel Audio Source Separation

Arpita Gang, Pravesh Biyani

Single channel source separation (SCSS) algorithms that use discriminative source models outperform those whose source models are trained independently. However, not all aspects of training discriminative models have been addressed in the literature. For instance, the choice of dimension of each source model (for example, the number of columns of an NMF basis matrix or a dictionary) influences not only the fidelity of a given source but also the interference introduced into it. Choosing the right dimension parameter for every source model is therefore crucial for effective separation. Moreover, the similarity between the constituent sources can differ from one mixture to another, so the dimensions should be chosen specifically for the sources in the mixture concerned. Further, separating one constituent from a mixture while treating the remaining constituents as interferers offers more freedom for that constituent and hence provides better separation. In this paper, we propose a generic discriminative learning framework in which we separate one source at a time and embed our dimension search algorithm in the training of the discriminative source models. We apply our framework to NMF-based SCSS algorithms and demonstrate improved separation performance for both speech-speech and speech-music mixtures.
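The idea of separating one source at a time while searching over the model dimension can be illustrated with a minimal sketch. The abstract does not specify the training or search procedure, so everything below is an assumption made for illustration: the function names (`nmf`, `separate_target`, `search_rank`) are hypothetical, the bases are trained independently with Euclidean multiplicative-update NMF rather than discriminatively, and the dimension search is a plain grid search scored by reconstruction error on a validation mixture.

```python
import numpy as np

def nmf(V, rank, n_iter=200, W=None, seed=0, eps=1e-9):
    """Euclidean NMF via multiplicative updates.

    If W is supplied it is kept fixed and only the activations H are updated
    (in that case `rank` is ignored and taken from W).
    """
    rng = np.random.default_rng(seed)
    fixed = W is not None
    if not fixed:
        W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        if not fixed:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def separate_target(V_mix, W_target, W_interf, eps=1e-9):
    """Estimate the target magnitude spectrogram, treating all else as interference."""
    W = np.hstack([W_target, W_interf])
    _, H = nmf(V_mix, W.shape[1], W=W)
    k = W_target.shape[1]
    S_target = W_target @ H[:k]
    S_interf = W_interf @ H[k:]
    mask = S_target / (S_target + S_interf + eps)  # soft mask in [0, 1)
    return mask * V_mix

def search_rank(V_target_train, V_interf_train, V_mix_val, V_target_val,
                ranks, interf_rank):
    """Assumed grid search: pick the target rank with the lowest validation error."""
    W_interf, _ = nmf(V_interf_train, interf_rank)
    best_rank, best_err = None, np.inf
    for r in ranks:
        W_target, _ = nmf(V_target_train, r)
        est = separate_target(V_mix_val, W_target, W_interf)
        err = np.linalg.norm(est - V_target_val)
        if err < best_err:
            best_rank, best_err = r, err
    return best_rank, best_err
```

All inputs are assumed to be non-negative magnitude spectrograms of shape (frequency bins, frames). A smaller target rank tends to limit how much interference the target basis can absorb, while a larger one tends to improve fidelity, which is the trade-off the dimension search is meant to balance.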

DOI: 10.21437/Interspeech.2016-701

Cite as

Gang, A., Biyani, P. (2016) On Discriminative Framework for Single Channel Audio Source Separation. Proc. Interspeech 2016, 565-569.

author={Arpita Gang and Pravesh Biyani},
title={On Discriminative Framework for Single Channel Audio Source Separation},
booktitle={Interspeech 2016},