12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

The Effects of Phoneme Errors in Speaker Adaptation for HMM Speech Synthesis

Bálint Tóth, Tibor Fegyó, Géza Németh

BME, Hungary

This paper investigates the effects of phoneme errors in the adaptation data of HMM-based speech synthesis. Such errors are likely to appear in transcriptions produced by automatic speech recognition (ASR). The research also examines the feasibility of unsupervised adaptation that relies solely on ASR transcriptions. To achieve better quality, a new method is introduced for selecting an optimal subset of the ASR-transcribed adaptation data, and the quality of the method is evaluated. The results show that adaptation was successful even at phoneme error rates above 50%.
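
For readers unfamiliar with the metric, the phoneme error rate mentioned above is commonly computed as the edit distance between the reference and the recognized phoneme sequences, divided by the reference length. The following is a minimal sketch of that common definition; the function names and the toy phoneme sequences are illustrative assumptions, and the scoring procedure actually used in the paper may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """Edit distance normalized by the reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical toy example: one substitution and one deletion in four phonemes.
reference = ["h", "e", "l", "o"]
hypothesis = ["h", "a", "l"]
per = phoneme_error_rate(reference, hypothesis)
print(f"PER = {per:.0%}")
```

Under this definition the rate can exceed 100% when the recognizer inserts many spurious phonemes, because the normalization uses the reference length only.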

