First International Conference on Spoken Language Processing (ICSLP 90)
This paper addresses the problem of automatic speech recognition under Lombard and noise conditions. The main contributions are a statistical analysis of vocal tract and speech parameters under the Lombard effect, and the formulation of a new speech recognition system that employs adaptive noise suppression and Lombard effect compensation front-end processors. The effects of noise and the Lombard effect on formant location, bandwidth, and mel-cepstral parameters are presented. These parameters vary considerably; spectral tilt in particular varies significantly across all phonemes. Approximately half of all mel-cepstral parameters show statistically significant variation from neutral. The significance of parameter variation shifts between noise-free and noisy Lombard conditions, suggesting the need for different compensation for noise-free and noisy Lombard speech. A new recognition algorithm employing noise-adaptive boundary detection, noise suppression, and voiced/unvoiced Lombard compensation is presented. The observed shift in mean cepstral values from neutral can be modeled using an exponential tilt, as suggested by Chen, but the exponential form appears to differ for each phoneme class. A new Lombard effect compensator is formulated that allows varying degrees of compensation to be applied to voiced and unvoiced speech sections. Preliminary recognition results suggest that separate compensation of voiced and unvoiced speech sections improves recognition performance by as much as 10% over no compensation.
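As a rough illustration of class-dependent compensation, the sketch below subtracts a modeled shift from each cepstral frame, using different parameters for voiced and unvoiced frames. It is not the compensator described in the paper. The additive form `delta[k] = a * exp(-b * k)`, the function names, and the parameter values are all assumptions chosen for illustration; the abstract states only that the mean cepstral shift can be modeled with an exponential tilt whose form differs by phoneme class.

```python
import math


def exponential_tilt(a, b, n_coeffs):
    """Assumed shift model: delta[k] = a * exp(-b * k) for k = 1..n_coeffs."""
    return [a * math.exp(-b * k) for k in range(1, n_coeffs + 1)]


def compensate_frame(cepstrum, is_voiced, params):
    """Subtract the class-specific modeled shift from one cepstral frame.

    `params` maps "voiced" / "unvoiced" to hypothetical (a, b) pairs.
    """
    a, b = params["voiced" if is_voiced else "unvoiced"]
    tilt = exponential_tilt(a, b, len(cepstrum))
    return [c - d for c, d in zip(cepstrum, tilt)]


# Made-up parameters, not values from the paper.
PARAMS = {"voiced": (0.5, 0.3), "unvoiced": (0.2, 0.6)}

frame = [1.0, 0.8, 0.6, 0.4]  # toy cepstral frame
voiced_out = compensate_frame(frame, True, PARAMS)
unvoiced_out = compensate_frame(frame, False, PARAMS)
```

Keeping the two classes as separate parameter pairs lets the degree of compensation be tuned independently for voiced and unvoiced sections; in practice the parameters would have to be estimated from measured neutral and Lombard speech data.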
Bibliographic reference. Hansen, John H. L. / Bria, Oscar N. (1990): "Lombard effect compensation for robust automatic speech recognition in noise", In ICSLP-1990, 1125-1128.