EUROSPEECH 2001 Scandinavia
This paper discusses both acoustical and textual correlates of prominence. Prominence, as we use it, is defined at the word level and is based on listener judgments. A selection of acoustic input features is tested for the classification of prominent words using feed-forward nets. We use spoken sentences from many different speakers, taken from the Dutch Polyphone corpus of telephone speech. For an independent test set of 1,000 sentences, about 72% of the words are correctly classified as prominent or not. At the text input level, we also developed an algorithm that predicts prominence using linguistic/syntactical features derived from text only. The prediction agrees with the perceived prominence in 82.6% of the cases.
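Both percentages reported above are word-level agreement scores between a predicted binary label and the listener-based label. A minimal sketch of such a score follows. It assumes each word carries a binary label (1 = prominent, 0 = not prominent); the function name `agreement_rate` and the toy labels are hypothetical and are not taken from the paper or the corpus.

```python
def agreement_rate(predicted, perceived):
    """Return the fraction of words whose predicted label equals the perceived label."""
    if len(predicted) != len(perceived):
        raise ValueError("label sequences must have equal length")
    if not predicted:
        raise ValueError("label sequences must be non-empty")
    # Count the word positions where both labels coincide.
    matches = sum(p == q for p, q in zip(predicted, perceived))
    return matches / len(predicted)


# Hypothetical toy labels for an eight-word sentence (illustration only).
perceived = [0, 1, 0, 0, 1, 0, 1, 0]
predicted = [0, 1, 0, 1, 1, 0, 0, 0]

rate = agreement_rate(predicted, perceived)
print(f"{100 * rate:.1f}% of words agree")
```

In this toy case six of the eight labels coincide, so the score is 75%. How the paper binarizes listener judgments into "prominent or not" is not specified here, so the 0/1 encoding is an assumption of the sketch.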
Bibliographic reference. Streefkerk, Barbertje M. / Pols, Louis C. W. / Bosch, Louis F. M. ten (2001): "Up to what level can acoustical and textual features predict prominence", In EUROSPEECH-2001, 811-814.