In this paper, we describe a prosody generation system for speech synthesis that derives duration and pitch directly from syntax trees. Instead of transforming the tree through special rules or extracting isolated features from it, we use the tree structure itself to construct a superpositional model that learns the relation between syntax and prosody. We implemented it in our SVOX text-to-speech system and evaluated it against the existing rule-based system. Informal listening tests showed that structural information from the tree is carried over to the prosody.
Index Terms: speech synthesis, prosody, syntax analysis
Bibliographic reference. Hoffmann, Sarah / Pfister, Beat (2012): "Employing sentence structure: syntax trees as prosody generators", In INTERSPEECH-2012, 470-473.