Fourth ISCA ITRW on Speech Synthesis

August 29 - September 1, 2001
Perthshire, Scotland

A Multi-lingual System for the Determination of Phonetic Word Stress Using Soft Feature Selection by Neural Networks

Horst-Udo Hain and Hans Georg Zimmermann

Siemens Corporate Technology, Munich, Germany

Any TTS system requires a routine to determine the transcription of out-of-vocabulary (OOV) words. This transcription contains three pieces of information: the phoneme sequence, the positions of the syllable boundaries and the position of word stress. In the TTS system "Papageno", the phonemes and syllable boundaries are determined by a neural network proposed in [1]. The same paper also proposed a second network for word stress determination. A similar architecture is used here, enhanced by a diagonal matrix between the input and the hidden layer whose weights are penalised by weight decay. Weight decay is a strategy that keeps a weight from growing unless the growth is really necessary. It can be used to improve the generalisation ability of the network.
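The effect of a decayed diagonal matrix can be illustrated with a small sketch. It is not the authors' network: the L2 form of the penalty, the toy gradients and all names (`apply_diagonal`, `decay_penalty`, `update_diagonal`, `simulate`) are assumptions made for illustration only. Each input feature gets one diagonal weight; a weight that the task gradient does not support is pulled towards zero by the decay term, which amounts to a soft deselection of that feature.

```python
def apply_diagonal(inputs, diag):
    """Scale each input feature by its own weight (product with a diagonal matrix)."""
    if len(inputs) != len(diag):
        raise ValueError("inputs and diag must have the same length")
    return [x * d for x, d in zip(inputs, diag)]


def decay_penalty(diag, lam):
    """Assumed L2 weight-decay term: lam times the sum of squared diagonal weights."""
    return lam * sum(d * d for d in diag)


def update_diagonal(diag, task_grad, lr, lam):
    """One gradient step on the task loss plus the assumed decay penalty."""
    # The derivative of lam * d^2 with respect to d is 2 * lam * d.
    return [d - lr * (g + 2.0 * lam * d) for d, g in zip(diag, task_grad)]


def simulate(steps=200, lr=0.1, lam=0.05):
    """Toy run: feature 0 is useful for the task, feature 1 is irrelevant."""
    diag = [1.0, 1.0]
    for _ in range(steps):
        # Hypothetical task gradients: feature 0 minimises 0.5 * (d0 - 1)^2,
        # feature 1 receives no task gradient at all.
        task_grad = [diag[0] - 1.0, 0.0]
        diag = update_diagonal(diag, task_grad, lr, lam)
    return diag


if __name__ == "__main__":
    d_useful, d_irrelevant = simulate()
    print("useful feature weight:    ", round(d_useful, 3))
    print("irrelevant feature weight:", round(d_irrelevant, 3))
```

In this toy run the weight of the useful feature settles close to 1, while the weight of the irrelevant feature keeps shrinking. A real system would train the diagonal weights jointly with the rest of the network by backpropagation.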


Bibliographic reference. Hain, Horst-Udo / Zimmermann, Hans Georg (2001): "A multi-lingual system for the determination of phonetic word stress using soft feature selection by neural networks", in SSW4-2001, paper 120.