Fourth ISCA ITRW on Speech Synthesis
August 29 - September 1, 2001
Starting with a rule-based prosody generation system, we try to improve the naturalness of the generated prosody by using a corpus-based approach, without losing the advantages of the rule-based method. To achieve this, we introduce a prosodic unit selection method, similar in approach to the waveform unit selection used by waveform concatenation systems with large unit inventories.
To avoid the problem of incomplete unit descriptions in existing prosodic databases, a new method of data collection and labeling is introduced. A small database of the proposed kind was collected, and results of applying the selection algorithm to it are given.
The approach described in this paper could be useful for improving the naturalness of prosody and for assisting in personalizing it. It requires relatively little expert manual work and can be used in small-footprint TTS systems.
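The abstract does not give the selection algorithm or its cost functions, so the following is only a generic sketch of how unit selection is commonly formulated: a dynamic-programming (Viterbi) search that picks one candidate unit per target position while minimizing the sum of target costs and join costs. The function name `select_units`, the `target_cost` and `join_cost` callables, and the representation of units are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")  # description of a target position (hypothetical)
U = TypeVar("U")  # a candidate prosodic unit from the database (hypothetical)


def select_units(
    targets: Sequence[T],
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[T, U], float],
    join_cost: Callable[[U, U], float],
) -> List[U]:
    """Generic Viterbi unit selection (illustrative, not the paper's method).

    candidates[i] lists the units that may fill target position i and
    must be non-empty. Returns the sequence with the lowest total cost.
    """
    if not targets:
        return []

    # best[i][j]: lowest cost of any path ending in candidates[i][j]
    best = [[target_cost(targets[0], u) for u in candidates[0]]]
    # back[i][j]: index of the predecessor on that lowest-cost path
    back = [[-1] * len(candidates[0])]

    for i in range(1, len(targets)):
        row_cost, row_back = [], []
        for u in candidates[i]:
            # Cost of reaching u from each unit at the previous position
            costs = [
                best[i - 1][k] + join_cost(prev, u)
                for k, prev in enumerate(candidates[i - 1])
            ]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            row_cost.append(costs[k_min] + target_cost(targets[i], u))
            row_back.append(k_min)
        best.append(row_cost)
        back.append(row_back)

    # Trace back from the cheapest final unit
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = back[i][j]
    return path[::-1]
```

In this sketch a unit can be any object; in a toy setting it could be a single pitch value, with absolute differences serving as both costs. A real system would define the costs over richer prosodic descriptions.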
Bibliographic reference. Meron, Joram (2001): "Prosodic unit selection using an imitation speech database", In SSW4-2001, paper 113.
There are five pairs of sentences. In each pair, one sentence was
produced using the rule-based prosody (rule_X.wav) and the other by
the method proposed in this paper (imit_X.wav).