Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Exploiting Dendritic Autocorrelogram Structure to Identify Spectro-Temporal Regions Dominated by a Single Sound Source

Ning Ma, Phil Green, André Coy

University of Sheffield, UK

Autocorrelograms exhibit tree-like structures whose spines are located at a delay of 1/F0. This paper exploits the dendritic autocorrelogram structure for the identification of spectro-temporal regions dominated by a single periodic sound source in monaural acoustic mixtures. Each frame of the mixture is first segmented into different sound sources in the autocorrelogram domain. Local pitch estimates are formed for each source and used as a cue for temporal integration. A confidence score is computed for each time-frequency pixel in the grouped regions to determine its probability of belonging to the group. The system is evaluated using simultaneous speech in a coherence-measuring experiment, and is also employed within an ASR system, where it produces improved results for the Interspeech 2006 Speech Separation Challenge.
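
The statement that the spines sit at a delay of 1/F0 can be illustrated with a toy example. The sketch below is not the authors' system: it assumes each frequency channel contains a single pure harmonic of F0 (a real autocorrelogram would come from an auditory filterbank such as a gammatone bank), and the function names, sampling rate, and frame sizes are hypothetical choices made for illustration. It computes a short-time autocorrelation per channel, sums across channels, and reads the pitch period off the largest peak.

```python
import math

def autocorrelation(x, max_lag):
    """Unnormalised short-time autocorrelation for lags 0..max_lag."""
    # Fixed window length so every lag is computed over the same number of samples.
    n = len(x) - max_lag
    return [sum(x[t] * x[t + lag] for t in range(n)) for lag in range(max_lag + 1)]

def toy_autocorrelogram(f0, fs, harmonics, frame_len, max_lag):
    """One row per channel; channel k holds a pure sinusoid at k * f0 (assumption)."""
    channels = []
    for k in harmonics:
        x = [math.sin(2 * math.pi * k * f0 * t / fs) for t in range(frame_len)]
        channels.append(autocorrelation(x, max_lag))
    return channels

def summary_peak_lag(channels, min_lag):
    """Sum the channels over frequency and return the lag of the largest peak."""
    summary = [sum(column) for column in zip(*channels)]
    # Skip the trivial maximum around lag 0.
    return max(range(min_lag, len(summary)), key=lambda lag: summary[lag])

fs = 8000   # sampling rate in Hz (assumed)
f0 = 100    # fundamental frequency in Hz (assumed)
channels = toy_autocorrelogram(f0, fs, harmonics=[1, 2, 3, 4],
                               frame_len=400, max_lag=120)
lag = summary_peak_lag(channels, min_lag=20)
estimated_f0 = fs / lag
print(lag, estimated_f0)
```

With these values the pitch period is fs/F0 = 80 samples. Every channel has a peak at that lag, which is what aligns the branches of the tree into a common spine, while peaks at shorter lags differ from channel to channel. A mixture of two periodic sources would produce two such structures, which is the cue the paper uses to segment each frame.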


Bibliographic reference. Ma, Ning / Green, Phil / Coy, André (2006): "Exploiting dendritic autocorrelogram structure to identify spectro-temporal regions dominated by a single sound source", in INTERSPEECH-2006, paper 1639-Mon3CaP.9.