13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Vowel Creation by Articulatory Control in HMM-based Parametric Speech Synthesis

Zhen-Hua Ling (1), Korin Richmond (2), Junichi Yamagishi (2)

(1) iFLYTEK Speech Lab, University of Science and Technology of China, China
(2) CSTR, University of Edinburgh, UK

This paper presents a method to produce a new vowel by articulatory control in hidden Markov model (HMM) based parametric speech synthesis. A multiple regression HMM (MRHMM) is adopted to model the distribution of acoustic features, with articulatory features used as external auxiliary variables. The dependency between acoustic and articulatory features is modelled by a group of linear transforms that are either estimated context-dependently or determined by the distribution of articulatory features. To ensure compatibility between the context-dependent model parameters and the articulatory features of a new vowel, vowel identity is removed from the set of context features. At synthesis time, acoustic features are predicted from the input articulatory features together with the context information. Given an appropriate articulatory feature sequence, a new vowel can be generated even when it does not exist in the training set. Experimental results show that this method is effective in creating the English vowel [ʌ] by articulatory control without using any acoustic samples of this vowel.
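
As a rough illustration of the regression idea in the abstract, the sketch below computes the acoustic mean of one HMM state as a linear function of the articulatory features of one frame. The form mean = A [x; 1] + mu, the function name predict_acoustic_mean, and all numeric values are assumptions made for illustration only; the abstract does not state the exact parameterisation, and this is not the authors' implementation.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def predict_acoustic_mean(transform: Matrix, bias: Vector, artic: Vector) -> Vector:
    """Return transform @ [artic; 1] + bias for one HMM state and one frame.

    Assumed form (hypothetical, for illustration): the state's acoustic mean
    is its context-dependent bias plus a linear transform of the articulatory
    feature vector extended with a constant 1.
    """
    expanded = artic + [1.0]  # append a constant so the transform can carry an offset
    if any(len(row) != len(expanded) for row in transform):
        raise ValueError("each transform row needs len(artic) + 1 entries")
    if len(transform) != len(bias):
        raise ValueError("transform and bias must have the same number of rows")
    return [
        sum(a * x for a, x in zip(row, expanded)) + b
        for row, b in zip(transform, bias)
    ]


# Hypothetical toy values: 2 acoustic dimensions, 2 articulatory dimensions.
A_state = [[0.5, 0.0, 0.0],
           [0.0, -1.0, 0.0]]
mu_state = [1.0, 2.0]

# Same state parameters, two different articulatory inputs.
print(predict_acoustic_mean(A_state, mu_state, [0.0, 0.0]))
print(predict_acoustic_mean(A_state, mu_state, [2.0, 1.0]))
```

In this toy setting the state parameters stay fixed while the articulatory input changes the predicted mean, which is the property the abstract relies on when an articulatory sequence for an unseen vowel is supplied at synthesis time.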

Index Terms: Speech synthesis, articulatory features, multiple-regression hidden Markov model


Bibliographic reference. Ling, Zhen-Hua / Richmond, Korin / Yamagishi, Junichi (2012): "Vowel creation by articulatory control in HMM-based parametric speech synthesis", in INTERSPEECH-2012, 991-994.