Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
This paper describes an algorithm for determining stress groups in a spoken utterance from acoustic parameters of the speech waveform. It uses normalised and smoothed measures of duration and energy to produce an index of stress that is highest on focussed parts of the utterance and lowest at the boundaries between stress groups. 81% of stressed words were correctly identified, with a false detection rate of less than 5%. The location of contrastively focussed words in the utterance was correctly recognised in 76% of cases.
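The general idea of combining normalised, smoothed duration and energy into a single stress index can be sketched as follows. This is only an illustration, not the paper's algorithm: the abstract does not say how the measures are normalised, smoothed, or combined, so the z-score normalisation, the three-point moving average, the equal weighting, and all function names (`zscore`, `smooth`, `stress_index`, `local_extrema`) are assumptions. The per-syllable durations and energies are invented values.

```python
from statistics import mean, pstdev


def zscore(values):
    """Normalise to zero mean and unit variance (an assumed normalisation)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def smooth(values, window=3):
    """Centred moving average; the window is truncated at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(mean(values[lo:hi]))
    return out


def stress_index(durations, energies, window=3):
    """Average the normalised duration and energy, then smooth the result."""
    d = zscore(durations)
    e = zscore(energies)
    combined = [(a + b) / 2 for a, b in zip(d, e)]
    return smooth(combined, window)


def local_extrema(index):
    """Return (peaks, troughs): candidate stressed units and group boundaries."""
    peaks, troughs = [], []
    for i in range(1, len(index) - 1):
        if index[i] > index[i - 1] and index[i] >= index[i + 1]:
            peaks.append(i)
        elif index[i] < index[i - 1] and index[i] <= index[i + 1]:
            troughs.append(i)
    return peaks, troughs


# Hypothetical per-syllable measurements (seconds, dB); not data from the paper.
durations = [0.10, 0.12, 0.30, 0.11, 0.09, 0.10, 0.28, 0.12, 0.09]
energies = [55, 58, 72, 57, 54, 56, 70, 58, 54]

index = stress_index(durations, energies)
peaks, troughs = local_extrema(index)
print(peaks, troughs)  # peaks at syllables 2 and 6, trough at syllable 4
```

In this toy example the peaks of the index fall on the long, loud syllables and the trough between them marks a candidate stress-group boundary. Note that truncating the window at the edges can distort the first and last values, so a real implementation would need to treat utterance boundaries more carefully.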
Bibliographic reference. Campbell, W. Nick (1992): "Prosodic encoding of English speech", in ICSLP-1992, 663-666.