DiSS-LPSS Joint Workshop 2010
The 5th Workshop on Disfluency in Spontaneous Speech
This paper presents a series of experiments on automatic transcription and classification of fillers and feedbacks in conversational speech corpora. A feature combination of PCA-projected normalized F0 Constant-Q Cepstra and MFCCs has been shown to be effective for standard Hidden Markov Models (HMMs). We demonstrate how to model both speaker channels with coupled HMMs and show the expected improvements. In particular, we explore model topologies that take advantage of predictive cues for fillers and feedbacks. This is done by initializing the training with special labels located immediately before fillers in the same channel and immediately before feedbacks in the other speaker's channel. The average F-score is 34.1% for a standard HMM, 36.7% for a coupled HMM, and 40.4% for a coupled HMM with pre-filler and pre-feedback labels. In a pilot study, the detectors were found to be useful for semi-automatic transcription of feedbacks and fillers in socializing conversations.
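The label initialization described above can be illustrated with a minimal sketch. It is not the authors' implementation: the segment format, the label strings, the function name pre_labels, and the 0.5 s window length are all assumptions made for illustration, since the abstract does not specify them. For one speaker channel, the sketch places a "pre-filler" label immediately before each filler in that same channel and a "pre-feedback" label immediately before each feedback produced in the other channel.

```python
from typing import List, Tuple

# Hypothetical segment format: (start_sec, end_sec, label).
Segment = Tuple[float, float, str]


def pre_labels(own: List[Segment], other: List[Segment],
               window: float = 0.5) -> List[Segment]:
    """Return the pre-labels to add to the `own` channel.

    `window` (seconds) is an assumed duration for the pre-label region;
    the abstract only says the labels sit "immediately before" the event.
    """
    out: List[Segment] = []
    # Pre-filler: immediately before a filler in the same channel.
    for start, _end, label in own:
        if label == "filler":
            out.append((max(0.0, start - window), start, "pre-filler"))
    # Pre-feedback: immediately before a feedback in the other channel.
    for start, _end, label in other:
        if label == "feedback":
            out.append((max(0.0, start - window), start, "pre-feedback"))
    return sorted(out)


# Toy example: speaker A produces a filler, speaker B a feedback.
channel_a = [(2.0, 2.4, "filler")]
channel_b = [(5.0, 5.3, "feedback")]
labels_a = pre_labels(channel_a, channel_b)
labels_b = pre_labels(channel_b, channel_a)
```

In this toy example, channel A receives both a pre-filler label (before its own filler) and a pre-feedback label (before speaker B's feedback), while channel B receives none. In a full system, such labels would only seed the training of the corresponding model states.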
Index Terms. fillers, feedbacks, coupled hidden Markov models, cross-speaker modeling, conversation
Bibliographic reference. Neiberg, Daniel / Gustafson, Joakim (2010): "Modeling conversational interaction using coupled Markov chains", In DiSS-LPSS-2010, 81-84.