Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
A computational model of auditory scene analysis is presented that can segregate speech from an arbitrary noise intrusion. A representational approach to hearing is adopted, in which models of higher auditory organization, termed auditory maps, are employed to make particular aspects of the auditory scene explicit. Specifically, the maps extract information about periodicity, onsets, offsets and frequency transitions in different spectral regions. The performance of the system has been evaluated on the task of segregating speech from a variety of interfering noises, such as "cocktail party" noise, tones, music and other speech. The waveform of the segregated speech can be recovered by a resynthesis technique, and informal listening tests suggest that the resynthesized speech is highly intelligible and quite natural. Additionally, signal-to-noise ratios have been compared before and after segregation by the system: an improvement in signal-to-noise ratio is obtained for each noise intrusion.
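The signal-to-noise comparison mentioned above can be illustrated with a minimal sketch. The abstract does not say how the ratios were computed, so this assumes the conventional energy-ratio definition in decibels, and it assumes the speech and noise components are available as separate sample sequences both before and after segregation. The function names `snr_db` and `snr_improvement_db` are hypothetical and are not taken from the paper.

```python
import math


def snr_db(signal, noise):
    """Energy-ratio SNR in decibels (assumed definition, not from the paper)."""
    signal_energy = sum(s * s for s in signal)
    noise_energy = sum(n * n for n in noise)
    if noise_energy == 0:
        return math.inf   # no noise left at all
    if signal_energy == 0:
        return -math.inf  # no signal left at all
    return 10.0 * math.log10(signal_energy / noise_energy)


def snr_improvement_db(speech_before, noise_before, speech_after, noise_after):
    """Change in SNR (dB) between the original mixture and the segregated output."""
    return snr_db(speech_after, noise_after) - snr_db(speech_before, noise_before)


# Toy data: the speech component is kept intact, the noise is attenuated tenfold.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [1.0, -1.0, 1.0, -1.0]
residual_noise = [0.1 * n for n in noise]

improvement = snr_improvement_db(speech, noise, speech, residual_noise)
```

In this toy case the mixture starts at 0 dB, and a tenfold amplitude reduction of the noise gives an improvement of about 20 dB. A positive value corresponds to the kind of improvement the abstract reports for each noise intrusion.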
Bibliographic reference. Brown, Guy J. / Cooke, Martin P. (1992): "A computational model of auditory scene analysis", in ICSLP-1992, 523-526.