5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

A Discourse Coding Scheme for Conversational Spanish

Lori Levin (1), Ann Thyme-Gobbel (2), Alon Lavie (1), Klaus Ries (1), Klaus Zechner (1)

(1) Carnegie Mellon University, USA
(2) Natural Speech Technologies, USA

This paper describes a three-level discourse coding scheme that we have devised for manual tagging of the CallHome Spanish (CHS) and CallFriend Spanish (CFS) databases used in the CLARITY project. The goal of CLARITY is to explore the use of discourse structure in understanding conversational speech. The project combines empirical methods for dialogue processing with state-of-the-art large-vocabulary continuous speech recognition (LVCSR), using the JANUS recognizer. The three levels of the coding scheme are (1) a speech act level consisting of a tag set extended from DAMSL and Switchboard; (2) a dialogue game level defined by initiative and intention; and (3) an activity level defined within topic units. The manually tagged dialogues are used to train automatic classifiers. We present preliminary results for automatic speech act classification and topic boundary identification, as well as inter-coder speech act confusion matrices.

Bibliographic reference. Levin, Lori / Thyme-Gobbel, Ann / Lavie, Alon / Ries, Klaus / Zechner, Klaus (1998): "A discourse coding scheme for conversational Spanish", in ICSLP-1998, paper 1000.