Fourth ISCA ITRW on Speech Synthesis

August 29 - September 1, 2001
Perthshire, Scotland

An implementation and evaluation of two diphone-based synthesizers for Turkish

Baris Bozkurt and Thierry Dutoit

Multitel-TCTS Lab, Faculté Polytechnique de Mons, Belgium

This paper presents two diphone-based Turkish text-to-speech systems. The first system is realized within the MBROLA project, a freely available multilingual speech synthesizer, and the second is based on shape-invariant harmonic modeling. Both synthesizers use the same parametric representations of two diphone databases (one male, one female), obtained by processing speech data with a pitch-asynchronous, fixed-frame-length harmonic/noise analyzer. To derive the pitch-synchronous representation required by the harmonic synthesizer from the original asynchronous one, the harmonic phases are passed through a phase-shifting algorithm, which also estimates the maximum harmonic frequency of each frame from the evolution of the harmonic phases. The MBROLA-based synthesizer has been integrated into a rudimentary TTS system inside EULER, and the harmonic synthesizer takes the files produced by the EULER system as its input for synthesis. Informal listening tests are being performed for quality assessment.
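As a rough illustration of what shifting harmonic phases involves, the sketch below applies the standard linear phase relation for a stationary harmonic frame: moving the time origin by a delay changes the phase of the k-th harmonic by 2π·k·f0·delay. This is not the authors' algorithm, which is not specified here; the function names `shift_harmonic_phases` and `synthesize_frame`, the constant-f0 assumption, and the omission of the noise component are all assumptions made for illustration.

```python
import math


def shift_harmonic_phases(phases, f0_hz, delay_s):
    """Re-reference harmonic phases to a time origin moved forward by delay_s.

    Hypothetical helper: assumes a stationary frame with constant f0, where
    phases[k-1] is the phase of harmonic k at the original frame origin.
    """
    shifted = []
    for k, phi in enumerate(phases, start=1):
        # Harmonic k advances by 2*pi*k*f0*delay radians over the delay.
        new_phi = phi + 2.0 * math.pi * k * f0_hz * delay_s
        # Wrap the result back into [-pi, pi].
        shifted.append(math.atan2(math.sin(new_phi), math.cos(new_phi)))
    return shifted


def synthesize_frame(amps, phases, f0_hz, t):
    """Evaluate the harmonic part of a frame at time t (seconds from origin)."""
    return sum(
        a * math.cos(2.0 * math.pi * k * f0_hz * t + p)
        for k, (a, p) in enumerate(zip(amps, phases), start=1)
    )


if __name__ == "__main__":
    amps = [1.0, 0.5, 0.25]
    phases = [0.3, -1.2, 2.0]
    f0 = 120.0      # Hz, illustrative value
    delay = 0.0013  # seconds from the frame origin to the new reference instant
    shifted = shift_harmonic_phases(phases, f0, delay)
    # The waveform seen from the new origin matches the delayed original.
    print(synthesize_frame(amps, shifted, f0, 0.0))
    print(synthesize_frame(amps, phases, f0, delay))
```

Under these assumptions the two printed values agree, which shows that the shift only changes the reference instant of the frame and leaves the waveform itself untouched.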

