Fourth ISCA ITRW on Speech Synthesis

August 29 - September 1, 2001
Perthshire, Scotland

Synthesizing static vowels and dynamic sounds using a 3D vocal tract model

Olov Engwall

CTT, Centre for Speech Technology, KTH, Stockholm, Sweden

The KTH 3D Vocal Tract project aims at multimodal synthesis, producing both visual and acoustic output from an articulatory model. The intra-oral visual synthesis has been developed over the last couple of years, combining measurements from Magnetic Resonance Imaging, Electromagnetic Articulography and Electropalatography. This paper presents the first acoustic evaluation of the model. Nine static vowels have been synthesized with fairly good correspondence between the reference subject's target formants and the model's formants. The synthesis is based on the area function calculated directly from the vocal tract model by sampling the cross-sectional area at 23 semi-polar planes. The paper presents the generation of the vocal tract walls, modeled on one reference subject, the algorithms for collision handling and cross-sectional contour extraction, and the results of the acoustic synthesis.
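To illustrate how an area function relates to formants, the following is a minimal sketch rather than the synthesis method used in this paper, which the abstract does not specify. It assumes a lossless tube made of equal-length cylindrical sections, closed at the glottis and ideally open at the lips, and locates resonances as the zeros of the chain-matrix element D. The function name `tube_formants`, its parameters, the equal section length and the sound speed of 35000 cm/s are all illustrative assumptions.

```python
import numpy as np


def tube_formants(areas_cm2, section_len_cm, c=35000.0, f_max=5000.0, df=5.0):
    """Resonance frequencies (Hz) of a lossless concatenated-tube model.

    Assumptions (not taken from the paper): equal section lengths, no losses,
    closed glottis, zero pressure at the lips. areas_cm2 is ordered from
    glottis to lips.
    """
    areas = np.asarray(areas_cm2, dtype=float)

    def d_element(f):
        # Chain matrix of the whole tract: [P_glottis, U_glottis] = K [P_lips, U_lips].
        # With P_lips = 0, U_lips / U_glottis = 1 / D, so resonances are zeros of D.
        kl = 2.0 * np.pi * f / c * section_len_cm
        total = np.eye(2, dtype=complex)
        for a in areas:
            z = 1.0 / a  # characteristic impedance up to the factor rho*c, which cancels in D
            section = np.array([[np.cos(kl), 1j * z * np.sin(kl)],
                                [1j * np.sin(kl) / z, np.cos(kl)]])
            total = total @ section
        return total[1, 1].real  # D is purely real for a lossless tube

    freqs = np.arange(df, f_max, df)
    vals = np.array([d_element(f) for f in freqs])
    formants = []
    for i in range(len(freqs) - 1):
        if vals[i] == 0.0:
            # A grid point that happens to be an exact zero of D.
            formants.append(float(freqs[i]))
            continue
        if vals[i] * vals[i + 1] < 0:
            # Sign change: refine the zero crossing by bisection.
            lo, hi, v_lo = freqs[i], freqs[i + 1], vals[i]
            for _ in range(40):
                mid = 0.5 * (lo + hi)
                v_mid = d_element(mid)
                if v_lo * v_mid <= 0:
                    hi = mid
                else:
                    lo, v_lo = mid, v_mid
            formants.append(float(0.5 * (lo + hi)))
    return formants


# Example: a uniform tube of 23 sections with a total length of 17 cm.
uniform = tube_formants([3.0] * 23, 17.0 / 23)
```

For the uniform tube in the example, the result should match the quarter-wavelength resonances (2n - 1)c / 4L, i.e. roughly 515, 1544 and 2574 Hz for the first three formants. A model like the one described here would supply 23 measured areas instead of a constant, and section lengths taken from a semi-polar grid would generally not be equal.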

