4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
Acoustic analyses of speech and perceptual studies indicate that the dominant acoustic correlates of vowel perception are the frequencies of the first three formants. However, most vowels are not completely steady-state (even in isolation), and formant frequencies change with the surrounding consonantal context, prosodic influences, speaking rate, and the vocal tract length of the talker. In the present studies, both natural and synthetic syllables ("head" and "had") were used to explore the relative potency of average formant frequencies, vocalic duration, and formant frequency movement in vowel perception. A male talker was identified whose formant frequencies at the midpoint of the words "had" and "head" were identical. These tokens nevertheless differed in their voiced duration and in the movement of the first three formants, and both were highly intelligible. Since the formant frequencies at midpoint could not distinguish the two words, listeners were clearly using different or additional information to guide perception. In the first study, vowel duration was varied. Digital waveform editing was used to generate two series, one based on "had" and the other based on "head". Overall, duration had little effect on listeners' classification of the stimuli. The second study employed synthetic series in which the movements of the first three formants were varied between those of the natural "had" and those of the natural "head". Here duration played a much larger role in listeners' responses. Together, these data are a step toward uncovering the relative roles of formant frequencies, formant movement, and duration in vowel perception within fluent syllables.
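As an illustration of how a series of stimuli can span two formant-movement endpoints, the following is a minimal sketch that linearly interpolates between two formant tracks. It is not the synthesis procedure used in the study, which the abstract does not specify. The function name `interpolate_tracks`, the three-point track representation (onset, midpoint, offset), the number of steps, and all frequency values are hypothetical placeholders; the only property carried over from the abstract is that the two endpoints share the same midpoint frequency.

```python
def interpolate_tracks(track_a, track_b, n_steps):
    """Return n_steps tracks stepping linearly from track_a to track_b.

    Each track is a list of formant frequencies (Hz) sampled at the same
    time points. The first returned track equals track_a and the last
    equals track_b.
    """
    if n_steps < 2:
        raise ValueError("n_steps must be at least 2")
    if len(track_a) != len(track_b):
        raise ValueError("tracks must have the same number of samples")

    series = []
    for step in range(n_steps):
        weight = step / (n_steps - 1)  # 0.0 at track_a, 1.0 at track_b
        series.append(
            [(1 - weight) * a + weight * b for a, b in zip(track_a, track_b)]
        )
    return series


# Hypothetical F1 tracks in Hz at (onset, midpoint, offset).
# These are placeholder values, NOT measurements from the study.
f1_had = [600.0, 650.0, 700.0]
f1_head = [600.0, 650.0, 550.0]

# A hypothetical five-step series between the two endpoints.
series = interpolate_tracks(f1_had, f1_head, 5)
```

Because both placeholder endpoints have the same midpoint value, every member of the series keeps that midpoint and differs only in how the formant moves away from it, which is the kind of contrast the second study manipulates.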
Bibliographic reference. Sawusch, James R. (1996): "Effects of duration and formant movement on vowel perception", In ICSLP-1996, 2482-2485.