This paper reports an experiment in synthesizing French connected speech using Maeda's digital simulation of the vocal-tract system. The dynamics of the vocal-tract shape are estimated from the dynamics of Electromagnetic Articulograph (EMA) sensors via Maeda's geometrical articulatory model. Time-varying characteristics of the glottis and the velopharyngeal port are set using empirical rules, while the fundamental frequency pattern is copied from the concurrently recorded audio signal. A subjective experiment was performed online to assess the perceived intelligibility and naturalness of the synthesized speech. Results indicate that a properly driven simulation of the vocal tract has the potential to provide a scientifically grounded alternative to the development of text-to-speech synthesis systems.
Bibliographic reference. Toutios, Asterios / Narayanan, Shrikanth (2013): "Articulatory synthesis of French connected speech from EMA data", In INTERSPEECH-2013, 2738-2742.