Second ESCA/IEEE Workshop on Speech Synthesis

September 12-15, 1994
Mohonk Mountain House, New Paltz, NY, USA

Automatic Speech Segmentation for Concatenative Inventory Selection

Andrej Ljolje, Julia Hirschberg, Jan P. H. van Santen

AT&T Bell Laboratories, Murray Hill, NJ, USA

Development of multiple synthesis systems requires multiple transcribed speech databases. Here we explore an automatic technique for segmenting speech into phonetic segments, applied to a single-speaker Italian database. The output segmentation is compared to manual segmentations by two human transcribers. Performance is very good on voiced-stop-to-vowel and unvoiced-fricative-to-vowel boundaries, while vowel-to-vowel and voiced-fricative-to-vowel boundaries are estimated less accurately.
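
As an illustration of how an automatic segmentation might be compared with a manual one per boundary type, the following is a minimal sketch and not the authors' evaluation procedure. It assumes that both segmentations list the same boundaries in the same order, that each boundary carries a transition-class label, and that agreement is judged against a tolerance of 20 ms; the function name, the label strings, and the tolerance are all hypothetical choices made for this example.

```python
from collections import defaultdict


def boundary_agreement(auto, manual, tolerance_ms=20.0):
    """Summarise automatic-vs-manual boundary deviations per transition class.

    auto, manual: lists of (transition_class, time_ms) pairs, aligned so that
    entry i in both lists refers to the same boundary (an assumption of this
    sketch). tolerance_ms is an illustrative threshold, not a value from the
    paper.
    """
    if len(auto) != len(manual):
        raise ValueError("segmentations must contain the same boundaries")

    deviations = defaultdict(list)
    for (cls_auto, t_auto), (cls_manual, t_manual) in zip(auto, manual):
        if cls_auto != cls_manual:
            raise ValueError("boundary labels are not aligned")
        # Absolute difference between the two boundary placements.
        deviations[cls_auto].append(abs(t_auto - t_manual))

    summary = {}
    for cls, devs in deviations.items():
        summary[cls] = {
            "mean_abs_dev_ms": sum(devs) / len(devs),
            "within_tolerance": sum(d <= tolerance_ms for d in devs) / len(devs),
        }
    return summary
```

Run once per human transcriber, and once between the two transcribers, such a summary would give a per-class view of where automatic and manual boundaries diverge. Any input data used with it here would be invented for illustration and does not come from the paper.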

