GFM-Voc: A Real-Time Voice Quality Modification System

Olivier Perrotin, Ian McLoughlin

This article introduces GFM-Voc, a new system for high-quality, real-time voice modification that supports both vocalic formant shifting and voice quality manipulation. The system is built on a newly developed source-filter decomposition method, GFM-IAIF, which extracts the spectral envelopes of both the vocal tract and the glottis as a compact set of filter parameters. These parameters can be adjusted through a GUI before the speech is re-synthesised with the modified values. The system requires no training and operates on any voice, male or female, without tuning. Because spectral parameters are closely linked to speech perception, the system provides an intuitive way to manipulate independently the vocalic formants and the spectral shape of the glottal flow, which is responsible for the perception of voice quality. Additionally, rules have been implemented that link the glottis parameters to high-level voice quality parameters such as vocal force and tenseness. Example applications include expressive speech synthesis, by adding the system at the end of a speech synthesiser pipeline; auditory feedback perturbation, to study a speaker’s response to modified speech; and speech therapy.
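The analyse, modify, re-synthesise loop described above can be illustrated with a minimal sketch. It is not GFM-IAIF: it uses plain autocorrelation-method linear prediction (LPC) as a stand-in, so a single all-pole envelope is estimated and the glottal contribution is not separated from the vocal tract. The function names (`lpc`, `shift_formants`, `all_pole_filter`), the synthetic test frame, the filter order, and the shift factor of 1.1 are all assumptions chosen for illustration, not part of the published system.

```python
import numpy as np

def lpc(frame, order):
    """Autocorrelation-method LPC. Returns A(z) coefficients [1, a1, ..., ap]."""
    w = frame * np.hanning(len(frame))          # analysis window
    n = len(w)
    r = np.correlate(w, w, mode="full")[n - 1:n + order]   # lags 0..order
    idx = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    R = r[idx] + 1e-9 * r[0] * np.eye(order)    # Toeplitz matrix, lightly regularised
    return np.concatenate(([1.0], np.linalg.solve(R, -r[1:])))

def shift_formants(a, factor):
    """Scale the angle (i.e. frequency) of each complex pole of 1/A(z)."""
    poles = np.roots(a)
    mag, ang = np.abs(poles), np.angle(poles)
    is_complex = np.abs(poles.imag) > 1e-8      # real poles are left untouched
    new_ang = np.where(is_complex, np.clip(ang * factor, -np.pi, np.pi), ang)
    return np.real(np.poly(mag * np.exp(1j * new_ang)))

def all_pole_filter(a, x):
    """Filter x through 1/A(z) with a direct-form recursion."""
    y = np.zeros(len(x))
    for i in range(len(x)):
        acc = x[i]
        for k in range(1, min(len(a), i + 1)):
            acc -= a[k] * y[i - k]
        y[i] = acc
    return y

# Hypothetical test frame: a 100 Hz impulse train through two resonances.
fs = 16000
n = 640
excitation = np.zeros(n)
excitation[::160] = 1.0
true_poles = [0.97 * np.exp(2j * np.pi * f / fs) for f in (700.0, 1200.0)]
a_true = np.real(np.poly(true_poles + [np.conj(p) for p in true_poles]))
frame = all_pole_filter(a_true, excitation)

a = lpc(frame, order=4)                     # analysis: envelope parameters
residual = np.convolve(frame, a)[:n]        # inverse filtering: source estimate
a_shifted = shift_formants(a, 1.1)          # modification: raise resonances by 10 %
y = all_pole_filter(a_shifted, residual)    # re-synthesis with modified envelope
```

In this sketch, a pole at angle θ corresponds to a resonance near θ·fs/(2π) Hz, so scaling the angle moves the resonance while leaving its bandwidth (pole radius) unchanged. A real-time system would additionally process overlapping frames and interpolate parameters between them, which is omitted here.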

Cite as: Perrotin, O., McLoughlin, I. (2019) GFM-Voc: A Real-Time Voice Quality Modification System. Proc. Interspeech 2019, 3685-3686.

