EUROSPEECH 2001 Scandinavia
The paper presents the speaker normalization technique we implemented in a teaching and training system for hearing-handicapped children, with the goal of reducing inter-speaker variability in the time-frequency representation of speech. To reduce the variance caused by differences in vocal tract shape among speakers, a formant-based nonlinear frequency warping approach to vocal tract normalization is investigated. The proposed method can be realized efficiently in an analysis-by-synthesis framework. After the speech is decomposed into a vocal tract envelope and an excitation model, the vocal tract envelope is warped by the estimated frequency warping function, while the excitation characteristics are mapped to those of the reference speaker. The results show a significant decrease in spectral distance between the test and reference speakers for correctly pronounced words after normalization, whereas for words poorly pronounced by the test speaker the spectral distance remains relatively high.
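The envelope-warping step can be illustrated with a minimal sketch. The abstract does not state the form of the warping function, so this example assumes a piecewise-linear warp anchored at matching formant frequencies of the test and reference speakers, which is one common way to build a formant-based nonlinear warp and not necessarily the authors' method. The function names `interp` and `warp_envelope`, the formant values, and the sampled envelope are all hypothetical.

```python
from bisect import bisect_right


def interp(xs, ys, x):
    """Piecewise-linear interpolation of the points (xs, ys) at x.

    xs must be strictly increasing; x is clamped to [xs[0], xs[-1]].
    """
    x = min(max(x, xs[0]), xs[-1])
    # Index of the segment [xs[i], xs[i + 1]] that contains x.
    i = min(bisect_right(xs, x) - 1, len(xs) - 2)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


def warp_envelope(envelope, test_formants, ref_formants, nyquist):
    """Warp a sampled spectral envelope so test formants land on reference formants.

    ASSUMPTION: the warp is piecewise linear with knots at 0 Hz, the formant
    pairs, and the Nyquist frequency. Formants must be strictly increasing
    and lie strictly between 0 and nyquist.
    """
    n = len(envelope)
    freqs = [k * nyquist / (n - 1) for k in range(n)]  # uniform frequency grid
    test_knots = [0.0, *test_formants, float(nyquist)]
    ref_knots = [0.0, *ref_formants, float(nyquist)]
    warped = []
    for f in freqs:
        # Inverse warp: which test-speaker frequency maps onto output frequency f?
        f_src = interp(ref_knots, test_knots, f)
        # Read the original envelope at that source frequency.
        warped.append(interp(freqs, envelope, f_src))
    return warped


# Hypothetical example: 41 bins spaced 100 Hz apart, peaks at the test formants.
test_f = [500.0, 1500.0, 2500.0]
ref_f = [600.0, 1800.0, 2700.0]
env = [0.0] * 41
for peak_bin in (5, 15, 25):
    env[peak_bin] = 1.0
warped_env = warp_envelope(env, test_f, ref_f, nyquist=4000.0)
```

In this toy case the envelope peaks move from the test speaker's formant bins to the bins of the reference formants. The excitation mapping and the analysis-by-synthesis decomposition mentioned in the abstract are outside the scope of the sketch.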
Bibliographic reference. Ogner, Marcel / Kacic, Zdravko (2001): "Speaker normalization based on test to reference speaker mapping", In EUROSPEECH-2001, 1507-1510.