Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Single Channel Speech Enhancement by Frequency Domain Constrained Optimization and Temporal Masking

Wen Jin, Michael Scordilis

University of Miami, USA

A speech enhancement algorithm is proposed that exploits the masking properties of the human auditory system. The enhancement is formulated as a frequency domain constrained optimization problem: the noise components of the noisy speech are suppressed by a gain function, subject to the constraint that both the signal distortion and the residual noise fall below the masking thresholds. Temporal as well as simultaneous masking effects are incorporated into the estimation of the masking thresholds. The algorithm was tested on speech corrupted by white Gaussian noise and by multitalker babble noise, and its performance was evaluated using ITU PESQ scores and segmental SNR. Experimental results indicate that the proposed gain function performs slightly but consistently better than an earlier perceptually motivated enhancement algorithm. A larger improvement is achieved when temporal masking effects are incorporated.
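
For readers unfamiliar with the segmental SNR measure mentioned in the abstract, the following is a minimal Python sketch of one common way to compute it. It is not taken from the paper. The function name `segmental_snr`, the frame length of 256 samples, the non-overlapping frames, and the clamping range of -10 dB to 35 dB are all assumptions chosen for illustration; the paper may use different settings.

```python
import numpy as np

def segmental_snr(clean, enhanced, frame_len=256, min_db=-10.0, max_db=35.0):
    """Average of per-frame SNRs in dB between a clean and an enhanced signal.

    Assumptions (not from the paper): non-overlapping frames of `frame_len`
    samples, and each frame SNR clamped to [min_db, max_db] before averaging.
    """
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    n_frames = min(len(clean), len(enhanced)) // frame_len
    if n_frames == 0:
        raise ValueError("signals are shorter than one frame")

    eps = 1e-12  # guards against division by zero and log of zero
    frame_snrs = []
    for i in range(n_frames):
        start, stop = i * frame_len, (i + 1) * frame_len
        s = clean[start:stop]
        err = s - enhanced[start:stop]  # distortion plus residual noise
        snr_db = 10.0 * np.log10((np.sum(s ** 2) + eps) / (np.sum(err ** 2) + eps))
        frame_snrs.append(np.clip(snr_db, min_db, max_db))
    return float(np.mean(frame_snrs))
```

Clamping each frame before averaging keeps silent or near-perfect frames from dominating the mean, which is the usual reason this measure is preferred over a single global SNR.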


Bibliographic reference.  Jin, Wen / Scordilis, Michael (2006): "Single channel speech enhancement by frequency domain constrained optimization and temporal masking", In INTERSPEECH-2006, paper 1027-Tue3FoP.1.