13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Employing Sentence Structure: Syntax Trees as Prosody Generators

Sarah Hoffmann, Beat Pfister

Speech Processing Group, ETH Zürich, Switzerland

In this paper, we describe a prosody generation system for speech synthesis that uses syntax trees directly to obtain duration and pitch. Instead of transforming the tree through special rules or extracting isolated features from it, we use the tree structure itself to construct a superpositional model that can learn the relation between syntax and prosody. We integrated the system into our SVOX text-to-speech system and evaluated it against the existing rule-based prosody generation. Informal listening tests showed that structural information from the tree is carried over to the prosody.
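To make the idea of a superpositional model over a syntax tree concrete, here is a minimal sketch in Python. It is not the authors' implementation: the `Node` class, the linear per-node contour, and the hand-set `weights` are all hypothetical stand-ins chosen for illustration. The abstract only states that the tree structure itself defines the model and that the syntax-prosody relation is learned; here the per-label weights are simply fixed by hand. Each tree node contributes a small contour over the syllables it spans, and the final value per syllable is the sum of the contributions of all nodes covering it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A syntax tree node covering syllables [start, end). Hypothetical structure."""
    label: str
    start: int
    end: int
    children: List["Node"] = field(default_factory=list)


def node_contour(node: Node, weights: Dict[str, float]) -> List[float]:
    """Assumed per-node contribution: a linear fall from the label's weight to zero."""
    n = node.end - node.start
    w = weights.get(node.label, 0.0)
    if n == 1:
        return [w]
    return [w * (1.0 - i / (n - 1)) for i in range(n)]


def superpose(root: Node, n_syllables: int, weights: Dict[str, float]) -> List[float]:
    """Sum the contributions of every node in the tree, syllable by syllable."""
    total = [0.0] * n_syllables
    stack = [root]
    while stack:
        node = stack.pop()
        for offset, value in enumerate(node_contour(node, weights)):
            total[node.start + offset] += value
        stack.extend(node.children)
    return total


# Toy sentence of four syllables: S -> NP (syllables 0-1) VP (syllables 2-3).
tree = Node("S", 0, 4, [Node("NP", 0, 2), Node("VP", 2, 4)])
weights = {"S": 2.0, "NP": 1.0, "VP": 0.5}  # hand-set here; a real system would learn these
contour = superpose(tree, 4, weights)
print(contour)
```

In this toy setup the sentence-level node supplies an overall declination while each phrase node adds a local reset at its left edge, so the phrase boundary between NP and VP remains visible in the summed contour.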

Index Terms: speech synthesis, prosody, syntax analysis

Bibliographic reference. Hoffmann, Sarah / Pfister, Beat (2012): "Employing sentence structure: syntax trees as prosody generators", in INTERSPEECH-2012, 470-473.