The correlogram is an important mid-level representation of periodic sounds and is widely used in sound source separation and pitch detection. However, computing it is very time consuming. In this paper, we present a novel scheme for monaural voiced speech separation that does not compute correlograms. The noisy speech is first decomposed into time-frequency units. The pitch contour of the target speech is then extracted from the zero-crossing rate of the units. Finally, a comb filter is applied to label each unit as target speech or intrusion. Compared with a previous correlogram-based method, the proposed algorithm saves computing time and also yields better performance.
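The two steps named in the abstract, pitch estimation from zero crossings and comb-filter labeling, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the feed-forward comb pair, the energy-ratio criterion, and the 0.5 threshold are all assumptions chosen for illustration, and the input `x` stands in for the response of a single time-frequency unit.

```python
import numpy as np

def zero_crossing_period(x, fs):
    """Estimate the period (in seconds) of a narrowband signal
    from the spacing of its upward zero crossings."""
    x = np.asarray(x, dtype=float)
    # Indices n where the signal goes from negative to non-negative.
    up = np.flatnonzero((x[:-1] < 0) & (x[1:] >= 0))
    if len(up) < 2:
        return None  # not enough crossings to measure a period
    # Mean spacing between consecutive upward crossings is one period.
    return float(np.mean(np.diff(up))) / fs

def comb_ratio(x, fs, f0):
    """Fraction of energy kept by a comb filter that passes harmonics
    of f0, relative to one that cancels them (hypothetical criterion)."""
    x = np.asarray(x, dtype=float)
    lag = int(round(fs / f0))          # pitch period in samples
    a, b = x[lag:], x[:-lag]
    passed = np.sum((a + b) ** 2)      # x[n] + x[n - lag] reinforces harmonics of f0
    rejected = np.sum((a - b) ** 2)    # x[n] - x[n - lag] cancels harmonics of f0
    return passed / (passed + rejected)

def label_unit(x, fs, f0, threshold=0.5):
    """Label a unit as target (True) when it is consistent with pitch f0.
    The threshold value is an assumption, not taken from the paper."""
    return comb_ratio(x, fs, f0) > threshold
```

In this sketch a unit dominated by a harmonic of the estimated pitch yields a ratio near 1, while a unit whose periodicity does not match the pitch yields a lower ratio. Only delayed additions and subtractions are needed, which is the kind of saving over an autocorrelation-based correlogram that the abstract refers to.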
Bibliographic reference. Zhang, Xueliang / Liu, Wenju (2011): "Monaural voiced speech segregation based on pitch and comb filter", In INTERSPEECH-2011, 1741-1744.