Third International Conference on Spoken Language Processing (ICSLP 94)
This paper discusses several characteristics that might be responsible for the lack of naturalness perceived when listening to longer stretches of synthetic speech. By systematically manipulating these characteristics in the available Dutch allophone-based rule synthesis system, we evaluated their effects on naturalness and on generally perceived intelligibility. First results of our study show that 1) a speaker may use several tools to highlight certain informationally important parts of the speech; 2) applying more spectral and temporal reduction in the synthesized speech increases perceived naturalness without affecting intelligibility as evaluated in a general way; 3) the effect of avoiding between-word coarticulation before words or word groups in focus was overruled by the low quality of the synthesized speech when the stimuli were presented as longer paragraphs, whereas in one-sentence stimuli it worked negatively for some listeners and positively for others.
Bibliographic reference. Koopmans-van Beinum, Florien J. / Pols, Louis C. W. (1994): "Naturalness and intelligibility of rule-synthesized speech, supplied with specific spectro-temporal features derived from natural continuous speech", In ICSLP-1994, 1787-1790.