INTERSPEECH 2006 - ICSLP
Earlier studies have shown that degradation due to environmental background noise is non-uniform across the phoneme classes of speech. In this study, we present an improved formulation of single-channel constrained iterative speech enhancement (AutoLSP) that follows a Rover-based paradigm. The new approach overcomes some of the drawbacks observed earlier in the baseline AutoLSP system. First, it removes the sensitivity to choosing the correct terminating iteration. Second, it employs a phone-level non-uniform enhancement approach that significantly improves the perceptual quality of the overall utterance. Third, audible noise components are suppressed by incorporating an auditory masked threshold (AMT) framework. The proposed algorithm is evaluated using the Itakura-Saito (IS) objective quality measure over four noise sources and two SNR levels. Comparative evaluations with other baseline systems (AutoLSP, log-MMSE) reveal that the new algorithm exhibits consistent quality improvement for each noise case over all phoneme classes in the TIMIT corpus. Relative to the degraded speech, the IS distance is reduced by 35.09-46.88%. In terms of IS scores, the Rover scheme outperforms AutoLSP and log-MMSE by 9.21% and 11.19%, respectively.
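As an illustration of how an IS-based evaluation of this kind can be computed, the following minimal sketch uses one common spectral form of the Itakura-Saito distance between a clean and a processed power spectrum, together with a percentage reduction relative to the degraded input. The function names `itakura_saito` and `percent_reduction` and the toy spectra are hypothetical and chosen only for illustration; the paper may use a different (for example, LPC-based) form of the measure and a different averaging scheme over frames and phoneme classes.

```python
import math


def itakura_saito(ref_psd, test_psd):
    """Mean Itakura-Saito distance between two power spectra.

    Assumed form: mean over bins of P/Q - log(P/Q) - 1, where P is the
    reference (clean) spectrum and Q is the spectrum under test.
    All values must be strictly positive.
    """
    if len(ref_psd) != len(test_psd):
        raise ValueError("spectra must have the same length")
    total = 0.0
    for p, q in zip(ref_psd, test_psd):
        ratio = p / q
        total += ratio - math.log(ratio) - 1.0
    return total / len(ref_psd)


def percent_reduction(is_degraded, is_enhanced):
    """Percentage by which the IS distance drops relative to degraded speech."""
    return 100.0 * (is_degraded - is_enhanced) / is_degraded


# Toy power spectra (illustrative values only, not data from the paper).
clean = [1.0, 2.0, 4.0, 2.0]
degraded = [2.0, 4.0, 8.0, 4.0]     # noise doubles the power in every bin
enhanced = [1.25, 2.5, 5.0, 2.5]    # enhancement leaves a smaller mismatch

d_deg = itakura_saito(clean, degraded)
d_enh = itakura_saito(clean, enhanced)
print(f"IS(clean, degraded) = {d_deg:.4f}")
print(f"IS(clean, enhanced) = {d_enh:.4f}")
print(f"reduction = {percent_reduction(d_deg, d_enh):.2f}%")
```

A smaller IS distance indicates a spectrum closer to the clean reference, so a positive percentage reduction corresponds to an improvement over the degraded speech.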
Bibliographic reference. Das, Amit / Hansen, John H. L. (2006): "Decision directed constrained iterative speech enhancement", In INTERSPEECH-2006, paper 1866-Tue3FoP.8.