Speech Prosody 2012
The generation of duration of speech units from linguistic information, as one component of a prosody model, is considered to be a requirement for natural-sounding speech synthesis. This paper investigates the use of a multi-level exemplar-based model for duration generation for the purposes of expressive speech synthesis. The multi-level exemplar-based model has been proposed in the literature as a cognitive model for the production of duration. Implementing this model for duration generation in speech synthesis is not straightforward: it requires a set of modifications to the model, and both the linguistically related units and the context of the target units must be taken into consideration. The work presented in this paper implements the model and presents a solution to these issues through the use of prosodic-syntactic correlated data, full context information of the input example and corpus exemplars.
Index Terms: speech prosody, duration generation, exemplar-based model
Bibliographic reference. Abou-Zleikha, Mohamed / Székely, Éva / Cahill, Peter / Carson-Berndsen, Julie (2012): "Multi-level exemplar-based duration generation for expressive speech synthesis", In SP-2012, 59-62.