Fourth ISCA ITRW on Speech Synthesis
August 29 - September 1, 2001
This paper presents two diphone-based Turkish text-to-speech systems. The first system is realized inside the MBROLA project, a freely available multilingual speech synthesizer, and the second system is based on shape-invariant harmonic modeling. Both synthesizers use the same parametric representations of two diphone databases (male and female), obtained by processing speech data with a pitch-asynchronous, fixed-frame-length harmonic/noise analyzer. To obtain the pitch-synchronous representation required by the harmonic synthesizer from the original asynchronous one, the harmonic phases are passed through a phase-shifting algorithm, which also estimates the maximum harmonic frequency of each frame from the evolution of the harmonic phases. The MBROLA-based synthesizer has been implemented in a rudimentary TTS system inside EULER, and the harmonic synthesizer reads files produced by the EULER system to perform synthesis. Informal listening tests are being performed for quality assessment.
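As an illustration of the phase-shifting idea mentioned above, the following is a minimal sketch and not the authors' algorithm. It assumes a locally stationary fundamental frequency, so that moving the reference instant of a frame by dt seconds advances the phase of the k-th harmonic by 2*pi*k*f0*dt. The function name shift_harmonic_phases and its parameters are hypothetical, and the estimation of maximum harmonic frequencies described in the abstract is not covered.

```python
import math


def shift_harmonic_phases(phases, f0_hz, dt_s):
    """Shift the phases of harmonics 1..K to a new reference instant.

    Assumption (not from the paper): f0 is constant over the shift, so the
    k-th harmonic advances by 2*pi*k*f0_hz*dt_s radians.
    phases[0] is the phase of the first harmonic, in radians.
    """
    shifted = []
    for k, phi in enumerate(phases, start=1):
        new_phi = phi + 2.0 * math.pi * k * f0_hz * dt_s
        # Wrap the result back into the principal range [-pi, pi].
        shifted.append(math.atan2(math.sin(new_phi), math.cos(new_phi)))
    return shifted
```

Under this assumption, shifting by exactly one pitch period (dt = 1/f0) leaves every harmonic phase unchanged modulo 2*pi, which is a convenient sanity check for a frame-to-pitch-mark alignment step of this kind.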
Bibliographic reference. Bozkurt, Baris / Dutoit, Thierry (2001): "An implementation and evaluation of two diphone-based synthesizers for Turkish", In SSW4-2001, paper 110.