Sixth ISCA Workshop on Speech Synthesis
Articulatory speech synthesis can currently be viewed from two perspectives. (i) Technical perspective: Owing to progress in common computer hardware (a general increase in computation speed) and software (the usability of compilers and simulation software), it is now possible to develop comprehensive phonetic models of speech production that compute acoustic speech signals in nearly real time. Furthermore, phonetic knowledge has increased to a degree that these production models are now capable of achieving good to high acoustic quality. The main limitations lie in the control modules. In this paper we argue for a self-learning, input-dependent gestural control model for articulatory speech synthesis. (ii) Theoretical perspective: A comprehensive articulatory speech synthesis system capable of producing high-quality acoustic output necessarily incorporates a great deal of knowledge on all phonetic aspects of speech production: articulatory sound targets, typical articulatory movement strategies for realizing sounds or syllables (e.g. coarticulation), a general concept for the temporal coordination of speech-relevant articulatory movements (i.e. speech gestures), etc. In this paper an example of such a system is given, and a suggestion is proposed for the still open question of control strategies for high-quality articulatory speech synthesis.
Bibliographic reference. Kröger, Bernd J. (2007): "Perspectives for articulatory speech synthesis", In SSW6-2007, 391.