EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Design of an Optimal Continuous Speech Database for Text-To-Speech Synthesis Considered as a Set Covering Problem

Helene Francois, Olivier Boeffard

IRISA, Université Rennes 1, ENSSAT, France

Text-to-speech synthesis can be carried out by concatenating acoustic units taken from a continuous speech database. This paper presents the optimization of such a database according to phonetic criteria. A large corpus of texts (311 572 sentences) is assembled, phonetized automatically, and condensed to 12 217 sentences so that each of the most frequent triphonemes is retained with at least 10 tokens. This condensation is an NP-hard set covering problem, which is solved approximately using a greedy algorithm. The condensed database covers 25% of the distinct triphonemes of the initial corpus, each represented by at least 10 tokens, which is enough to cover 95% of the triphoneme tokens of the initial corpus. The distribution of the triphonemes in the condensed database remains proportional to their frequency in the initial corpus.
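
The greedy condensation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the selection criterion, so everything here is an assumption made for illustration: the function name `greedy_cover`, the gain measure (number of still-needed tokens a sentence contributes), the tie-breaking rule (lowest sentence index), and the choice to ignore units that occur fewer than `required` times in the whole corpus. It is not the authors' implementation.

```python
from collections import Counter


def greedy_cover(sentences, required=10):
    """Greedy approximation of a set multi-cover (illustrative sketch only).

    sentences: list of sentences, each a list of unit labels (e.g. triphonemes).
    required:  number of tokens wanted for every coverable unit.
    Returns the indices of the selected sentences, in selection order.
    """
    counts = [Counter(s) for s in sentences]

    # Corpus-wide token counts per unit.
    total = Counter()
    for c in counts:
        total.update(c)

    # Assumption: only units with at least `required` tokens are targeted.
    need = {u: required for u, n in total.items() if n >= required}

    def gain(i):
        # Tokens of sentence i that still help to meet an open requirement.
        return sum(min(n, need[u]) for u, n in counts[i].items() if u in need)

    chosen = []
    remaining = set(range(len(sentences)))
    while need and remaining:
        # Highest gain wins; ties go to the lowest index (hypothetical rule).
        best = max(remaining, key=lambda i: (gain(i), -i))
        if gain(best) == 0:
            break
        chosen.append(best)
        remaining.remove(best)
        # Update the outstanding requirements with the chosen sentence.
        for u, n in counts[best].items():
            if u in need:
                need[u] -= n
                if need[u] <= 0:
                    del need[u]
    return chosen


# Toy corpus with single-letter "units" and a requirement of 2 tokens each.
toy = [["a", "b", "c"], ["a", "b"], ["c", "d"], ["c"], ["d"]]
print(greedy_cover(toy, required=2))  # prints [0, 1, 2, 4]
```

In the toy run, sentence 3 is never selected because its only unit is already covered twice by the time it would be considered. A greedy choice of this kind gives no optimality guarantee, which is consistent with the abstract's description of an approximate solution.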


Bibliographic reference. Francois, Helene / Boeffard, Olivier (2001): "Design of an optimal continuous speech database for text-to-speech synthesis considered as a set covering problem", In EUROSPEECH-2001, 829-832.