Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
The Naturalness of 4 Swedish sentence pairs generated by 4 speech synthesis systems was assessed as a function of (a) the stimulus range ('context') presented to the subjects (Ss) and (b) the assessment method used: a 5-point Category Rating (CR) scale, an 11-point CR scale, or free-number Magnitude Estimation (ME). Three stimulus range conditions were created: natural speech as an internal (hidden) reference only (only sentences generated by the 4 synthesiser systems were presented), natural speech as a good external reference, and natural speech as both a good and a bad external reference. 72 Ss participated, divided into 3 groups: group I assessed the Naturalness of the 4 systems (no external reference, only internal), group II assessed the 4 systems + natural speech (good external reference), and group III assessed the 4 systems + natural speech + distorted natural speech (good and bad external reference), using both ME and CR scales (5- or 11-point). CRs of Naturalness are sensitive to changes in both response range and stimulus range. An 11-point CR scale generates lower ratings for each of the 4 synthesiser systems than a 5-point CR scale (approx. -0.5 unit), regardless of stimulus range. Increasing the stimulus range (by introducing natural speech, or natural and distorted natural speech, as part of the stimulus context) affects the Naturalness ratings of the 4 synthesis systems. For both CR scales (5- and 11-point), decreased ratings (-) as a function of increasing stimulus range were obtained for all 4 systems (up to -1.0 unit). This decrease is contrary to what has been found for Picture Quality (Jones & Marks, 1985; CCIR, 1986), where increases in Picture Quality ratings (+) were obtained as a function of increasing stimulus range using a 5-point CR scale as well as ME. The relationship between ME of Naturalness and stimulus range was more complex. Naturalness was found to be a metathetic continuum.
Bibliographic reference. Goldstein, Mikael / Lindström, Björn / Till, Ove (1992): "Some aspects on context and response range effects when assessing naturalness of Swedish sentences generated by 4 synthesiser systems", In ICSLP-1992, 1339-1342.