5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Acoustic and Perceptual Properties of Phonemes in Continuous Speech as a Function of Speaking Rate

Hisao Kuwabara

Department of Electronics and Information Science, Teikyo University of Science & Technology, Uenohara, Kitatsuru-gun, Yamanashi, Japan

This study investigates individual phonemes, focusing mainly on their duration, in continuous speech spoken at three rates: fast, normal, and slow. The speech material, fifteen short sentences uttered by four male speakers, comprises a total of 291 morae. The normal speaking rate (n-speech) is, on average, 150 milliseconds/mora (or 400 morae/minute). Taking the n-speech as reference, the four speakers were asked to read the sentences twice as fast (f-speech) and half as fast (s-speech) as the normal rate. Among consonants, the greatest influence of speaking rate in f-speech is found on the syllabic nasal /N/ and the least on the voiceless stop /t/. In s-speech, /N/ is again the most affected, but the least affected is the voiced stop /d/. The consonant-to-vowel duration ratio of a CV syllable in f-speech stays almost the same as in n-speech, whereas vowel lengthening becomes significantly large in s-speech. As expected, the formant frequencies of vowels differ significantly among the three rates. The five vowels tend to move closer together on the F1-F2 plane as the speaking rate increases, reflecting the neutralization of vowels. However, the average difference in the third formant is found to be very small.
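The rate figures in the abstract can be checked with a little arithmetic. The sketch below is illustrative only: the function names and the `rate_factor` parameter are assumptions introduced here, and the f-speech and s-speech values are the nominal targets implied by the instructions given to the speakers, not durations measured in the paper.

```python
MS_PER_MINUTE = 60_000


def morae_per_minute(ms_per_mora: float) -> float:
    """Convert an average mora duration (ms) into a rate in morae per minute."""
    return MS_PER_MINUTE / ms_per_mora


def target_mora_duration(normal_ms_per_mora: float, rate_factor: float) -> float:
    """Nominal mora duration when speaking rate_factor times the normal rate."""
    return normal_ms_per_mora / rate_factor


# Average n-speech mora duration reported in the abstract.
NORMAL_MS_PER_MORA = 150.0

# Rate factors relative to n-speech (assumed from "twice as fast" / "half as fast").
rate_factors = {"f-speech": 2.0, "n-speech": 1.0, "s-speech": 0.5}

for label, factor in rate_factors.items():
    ms = target_mora_duration(NORMAL_MS_PER_MORA, factor)
    print(f"{label}: {ms:.0f} ms/mora, {morae_per_minute(ms):.0f} morae/minute")
```

Under these assumptions, 150 ms/mora corresponds to 400 morae/minute, and the nominal targets are 75 ms/mora for f-speech and 300 ms/mora for s-speech. The rates the speakers actually achieved may differ from these targets.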

Bibliographic reference.  Kuwabara, Hisao (1997): "Acoustic and perceptual properties of phonemes in continuous speech as a function of speaking rate", In EUROSPEECH-1997, 1003-1006.