First European Conference on Speech Communication and Technology

Paris, France
September 27-29, 1989

Speaker Normalization via a Linear Transformation on a Perceptual Feature Space and its Benefits in ASR Adaptation

Gu Yong, John S. Mason

Department of Electrical and Electronic Engineering, University College, Swansea, UK

This paper examines the inter-speaker variability of perceptually weighted features known as PLP (perceptual linear prediction). The motivation is to find a successful transformation between speakers for use in adaptation in speech recognition. Weighted cepstral distance measures are examined, including a combination of the unweighted d-CEP and the root-power-sum slope distortion measure d-RPS. This combination is shown to be the most effective in speaker-independent ASR. It is found that differences between two speakers are exhibited relatively clearly and consistently in the PLP/RPS domain. The attenuation of these differences by a linear transformation forms the basis of the proposed adaptation method for speech recognition. Recognition experiments indicate clearly the effectiveness of the method.
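The abstract does not give formulas, so the following is only a minimal sketch of the ideas it names, under stated assumptions. It takes d-CEP to be a Euclidean distance between cepstral vectors and d-RPS to be the same distance with each coefficient difference weighted by its index k, which is the usual root-power-sum form. The mixing weight `alpha`, the function names, and the per-coefficient scale-and-offset fit are hypothetical illustrations; the paper's actual combination rule and transformation may differ (for instance, it may use a full matrix).

```python
import math


def d_cep(c1, c2):
    """Unweighted Euclidean distance between two cepstral vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def d_rps(c1, c2):
    """Root-power-sum distance: difference k is weighted by its 1-based index k."""
    return math.sqrt(
        sum((k * (a - b)) ** 2 for k, (a, b) in enumerate(zip(c1, c2), start=1))
    )


def d_combined(c1, c2, alpha=0.5):
    """Hypothetical mix of d-CEP and d-RPS; alpha is an assumed weight, not from the paper."""
    return alpha * d_cep(c1, c2) + (1.0 - alpha) * d_rps(c1, c2)


def fit_affine_per_coeff(src, tgt):
    """Least-squares scale and offset per coefficient so that a_k * src_k + b_k ~ tgt_k.

    src and tgt are equal-length lists of time-aligned frames from two speakers.
    This diagonal form is a simplifying assumption made for this sketch.
    """
    n = len(src)
    params = []
    for k in range(len(src[0])):
        xs = [frame[k] for frame in src]
        ys = [frame[k] for frame in tgt]
        mx, my = sum(xs) / n, sum(ys) / n
        var = sum((x - mx) ** 2 for x in xs)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        a = cov / var if var > 0 else 1.0  # fall back to identity scale
        params.append((a, my - a * mx))
    return params


def apply_affine(frame, params):
    """Map one source-speaker frame toward the target speaker's feature space."""
    return [a * x + b for x, (a, b) in zip(frame, params)]
```

In use, the parameters would be estimated from aligned frames of the same utterances spoken by a new speaker and a reference speaker, and `apply_affine` would then be applied to the new speaker's frames before distance computation in the recognizer.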


Bibliographic reference. Yong, Gu / Mason, John S. (1989): "Speaker normalization via a linear transformation on a perceptual feature space and its benefits in ASR adaptation", in EUROSPEECH-1989, 1258-1261.