5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Transformation Smoothing for Speaker and Environmental Adaptation

M. J. F. Gales

Cambridge University Engineering Department, Cambridge, UK

Much recent work has addressed how to transform HMMs, typically trained in a speaker-independent fashion on clean training data, so that they better represent data from a particular speaker or acoustic environment. These transforms are estimated from only a small amount of data, so large numbers of components must share the same transform. Normally, each component is constrained to use only one transform. This paper examines how to assign components to transforms optimally, in a maximum likelihood sense, and how to allow each component, or component grouping, to make use of many transformations. The theory is given both for obtaining "weights" for each transform and for estimating the transforms given a set of weights. The techniques are evaluated on both speaker and environmental adaptation tasks.
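
As a rough illustration of what letting a component "make use of many transformations" could look like, the sketch below forms an adapted mean as a weighted sum of several affine transforms of one component mean. The affine (matrix plus bias) form, the function names, and the example numbers are all assumptions made for illustration; the abstract does not state the form of the transforms, and no weight or transform estimation is attempted here.

```python
def apply_affine(A, b, mu):
    """Return A*mu + b for a matrix A (list of rows) and vectors b, mu."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)]


def smoothed_mean(mu, transforms, weights):
    """Weighted combination of several affine transforms of one mean.

    transforms: list of (A, b) pairs (hypothetical affine transforms).
    weights:    one weight per transform (assumed given, not estimated here).
    """
    if len(transforms) != len(weights):
        raise ValueError("need exactly one weight per transform")
    adapted = [0.0] * len(mu)
    for (A, b), w in zip(transforms, weights):
        transformed = apply_affine(A, b, mu)
        adapted = [acc + w * x for acc, x in zip(adapted, transformed)]
    return adapted


# Illustrative values only: one 2-D component mean and two transforms.
mu = [1.0, 2.0]
transforms = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5]),  # identity plus a bias
    ([[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0]),  # pure scaling
]

# Conventional case: the component uses a single transform.
hard = smoothed_mean(mu, transforms, [1.0, 0.0])

# Smoothed case: the component draws on both transforms.
soft = smoothed_mean(mu, transforms, [0.75, 0.25])
```

With weights [1.0, 0.0] the sketch reduces to the usual one-transform-per-component constraint, so hard assignment appears as a special case of the weighted form.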

Bibliographic reference. Gales, M. J. F. (1997): "Transformation smoothing for speaker and environmental adaptation", in EUROSPEECH-1997, 2067-2070.