Time-Frequency Masking for Blind Source Separation with Preserved Spatial Cues

Shadi Pirhosseinloo, Kostas Kokkinakis


This paper addresses speech source separation using time-frequency binary masks to segregate binaural mixtures. We describe an algorithm that handles reverberant mixtures and extracts the original sources while preserving their spatial locations. The algorithm is evaluated objectively, by comparing the estimated interaural time differences against their theoretical values, and subjectively, by testing localization acuity in normal-hearing listeners for different spatial locations in a reverberant room. Experimental results indicate that the proposed algorithm preserves the spatial information of the recovered source signals while keeping the signal-to-distortion and signal-to-interference ratios high.
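
The objective evaluation compares estimated interaural time differences (ITDs) with theoretical values. The abstract does not say how either quantity is computed, so the following is only a minimal sketch under stated assumptions, not the authors' procedure: the theoretical ITD is taken from the Woodworth spherical-head model, and the estimated ITD is the lag that maximizes the cross-correlation between the left and right channels. The function names, the head radius (0.0875 m), and the speed of sound (343 m/s) are illustrative choices that do not come from the paper.

```python
import math


def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Theoretical ITD in seconds from the Woodworth spherical-head model.

    Assumption: this model and its default parameters are illustrative;
    the paper may define its theoretical ITDs differently.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))


def estimate_itd(left, right, fs, max_lag):
    """Estimate the ITD in seconds as the lag maximizing cross-correlation.

    A positive result means the right channel lags the left channel.
    """
    n = min(len(left), len(right))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        # Strict comparison keeps the first lag that reaches the maximum.
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / fs


# Toy check: a single pulse, with the right channel delayed by 5 samples.
fs = 16000
left = [0.0] * 64
left[20] = 1.0
right = [0.0] * 5 + left[:-5]
print(estimate_itd(left, right, fs, max_lag=16), woodworth_itd(30.0))
```

In an evaluation of this kind, the same estimator would be applied to the left and right channels of each recovered source, and the result compared with the model value for the source's known azimuth. A small difference between the two suggests that the separation has left the interaural timing cue largely intact.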


DOI: 10.21437/Interspeech.2017-66

Cite as: Pirhosseinloo, S., Kokkinakis, K. (2017) Time-Frequency Masking for Blind Source Separation with Preserved Spatial Cues. Proc. Interspeech 2017, 1188-1192, DOI: 10.21437/Interspeech.2017-66.


@inproceedings{Pirhosseinloo2017,
  author={Shadi Pirhosseinloo and Kostas Kokkinakis},
  title={Time-Frequency Masking for Blind Source Separation with Preserved Spatial Cues},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={1188--1192},
  doi={10.21437/Interspeech.2017-66},
  url={http://dx.doi.org/10.21437/Interspeech.2017-66}
}