## Speech Prosody 2008
## Campinas, Brazil

(2) Centre for Speech Technology, KTH, Stockholm, Sweden

We investigate a recently introduced vector-valued representation of fundamental frequency variation, whose properties appear to be well suited to statistical sequence modeling. We show what the representation looks like, and apply hidden Markov models to learn prosodic sequences characteristic of higher-level turn-taking phenomena. Our analysis shows that the models learn exactly those characteristics which have been reported for the phenomena in the literature. Further refinements to the representation lead to a 12-17% relative improvement in speaker change prediction for conversational spoken dialogue systems.

__Bibliographic reference.__
Laskowski, Kornel / Edlund, Jens / Heldner, Mattias (2008):
"Learning prosodic sequences using the fundamental frequency variation spectrum",
In *SP-2008*, 151-154.