Speech Prosody 2010

Chicago, IL, USA
May 10-14, 2010

Cross-Genre Training for Automatic Prosody Classification

Anna Margolis (1,2), Mari Ostendorf (1), Karen Livescu (2)

(1) Department of Electrical Engineering, University of Washington, Seattle, WA, USA
(2) TTI-Chicago, Chicago, IL, USA

We consider methods for training a prosodic classifier using labeled data from a genre different from the one on which the system will be deployed. We study two binary tasks: word-level pitch accent detection and phrase boundary detection. Using radio news and conversational telephone speech, we evaluate cross-genre training with acoustic and textual features, and find that acoustic features transfer better than textual features in most cases. We also find that a single classifier trained on both genres nearly matches genre-dependent performance. We then examine some simple unsupervised domain adaptation approaches, including class proportion adjustment, sample selection bias correction, and feature normalization. With the exception of class proportion adjustment, which is slightly helpful in one case but proves unstable, none of the approaches improves cross-genre performance over the baseline.
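As a rough illustration of two of the adaptation ideas named in the abstract, the sketch below shows a textbook form of class proportion adjustment (rescaling a binary posterior for a different class prior) and per-genre z-score feature normalization. The abstract does not describe the authors' implementation, so both function names and the formulas chosen here are assumptions. In particular, the target-genre prior is passed in as a known value, whereas an unsupervised setting would have to estimate it.

```python
from statistics import mean, pstdev


def adjust_posterior(p_source, prior_source, prior_target):
    """Rescale a binary posterior P(y=1|x) from a source-genre classifier
    for a different class prior (hypothetical helper, standard prior-shift
    correction; assumes both priors lie strictly between 0 and 1)."""
    pos = p_source * prior_target / prior_source
    neg = (1.0 - p_source) * (1.0 - prior_target) / (1.0 - prior_source)
    return pos / (pos + neg)


def zscore_within_genre(values):
    """Normalize one feature to zero mean and unit variance using statistics
    computed within a single genre (hypothetical helper)."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


if __name__ == "__main__":
    # A posterior of 0.5 under a balanced source prior drops when the
    # target genre is assumed to have fewer positive (e.g. accented) words.
    print(adjust_posterior(0.5, prior_source=0.5, prior_target=0.25))
    print(zscore_within_genre([1.0, 2.0, 3.0]))
```

The posterior adjustment leaves the classifier unchanged and only shifts its decision threshold, which is one plausible reason such a correction can be sensitive to errors in the estimated target prior.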

Index Terms: prosody recognition, domain adaptation

Bibliographic reference: Margolis, Anna / Ostendorf, Mari / Livescu, Karen (2010): "Cross-genre training for automatic prosody classification", in SP-2010, paper 113.