First International Conference on Spoken Language Processing (ICSLP 90)
This paper is concerned with the effects of different pausing strategies on word recognizability in less-than-optimal, connected synthetic speech. The main hypothesis behind the experiments described is that human recognition of poor-quality speech can be improved by inserting well-formed speech pauses at appropriate positions within sentences. This hypothesis was tested with fifteen syntactically and semantically well-formed Dutch sentences, each 36 words and 68 syllables long, realized by diphone concatenation with appropriate synthetic sentence melodies. There were four stimulus conditions: 1) no pauses at all, 2) five pauses at syntactically motivated positions, 3) five pauses immediately preceding informative content words, and 4) five pauses at fixed intervals of six words each. Realizations of all versions of each sentence were given the same overall duration. Each sentence was listened to by 10 listeners in a blocked design involving 40 listeners in total. Each listener had a form giving the function words of each sentence and was asked to fill in the content words. The main results are the following: a) Syntactically motivated pause positions lead to higher recognition scores than any of the other conditions. b) Words immediately preceding pauses profit much more from syntactically motivated pauses, and suffer much more from ill-placed pauses, in their recognition scores than any other words. c) Monosyllables profit much more from well-placed pauses, and suffer much more from ill-placed pauses, than longer words.
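As a small illustration of condition 4, the sketch below computes where fixed-interval pauses would fall. It assumes that a pause follows every sixth word and that no pause follows the final word; the abstract does not spell out the exact positions, and the function name is hypothetical.

```python
def fixed_interval_pause_positions(n_words: int, interval: int) -> list[int]:
    """Return the 1-based indices of the words after which a pause is inserted.

    Assumption (not stated in the abstract): a pause follows every
    `interval`-th word, and no pause is placed after the last word.
    """
    return list(range(interval, n_words, interval))


# A 36-word sentence with a pause after every sixth word.
positions = fixed_interval_pause_positions(n_words=36, interval=6)
print(positions)
```

Under these assumptions a 36-word sentence receives five pauses, which divide it into six groups of six words, consistent with the five pauses reported for this condition.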
Bibliographic reference. Nooteboom, Sieb G. / Scharpff, P. / Heuven, Vincent J. Van (1990): "Effects of several pausing strategies on the recognizability of words in synthetic speech", In ICSLP-1990, 385-388.