Survey Talk: Realistic Physics-Based Computational Voice Production

Oriol Guasch

Simulating the complex physics of voice on realistic vocal tract geometries looked daunting a few years ago, but the field has recently experienced a significant boom. Earlier works mainly dealt with vowel production, for which solving the wave equation in a three-dimensional vocal tract suffices. As we depart from vowels, however, things quickly get harder. Simulating a few milliseconds of the sibilant /s/ demands high-performance computers to resolve the sound generated by turbulent eddies. Producing a diphthong implies dealing with dynamic geometries. A syllable like /sa/ seems out of reach of current computational capabilities, though some modelling techniques inspired by one-dimensional approaches may lead to more than acceptable results. The shaping of dynamic vocal tracts should be linked to biomechanical models to gain flexibility and achieve a more complete representation of how we humans generate voice. Besides, including phonation in the computations implies resolving the vocal fold self-oscillations and the very demanding coupling of the mechanical, fluid and acoustic fields. Finally, including naturalness in computational voice generation is a newborn and challenging task. In this talk, a general overview of realistic physics-based computational voice production will be given. Current achievements and remaining challenges will be highlighted and discussed.

Cite as: Guasch, O. (2019) Survey Talk: Realistic Physics-Based Computational Voice Production. Proc. Interspeech 2019.

  author={Oriol Guasch},
  title={{Survey Talk: Realistic Physics-Based Computational Voice Production}},
  booktitle={Proc. Interspeech 2019}