4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Segmentation of Spoken Dialogue by Interjections, Disfluent Utterances and Pauses

Kazuyuki Takagi (1), Shuichi Itahashi (2)

(1) The University of Electro-Communications, Tokyo, Japan
(2) University of Tsukuba, Tsukuba, Ibaraki, Japan

This paper attempts to segment the spontaneous speech of human-to-human spoken dialogues into relatively large units of speech, namely sub-phrasal units delimited by interjections, disfluent utterances, and pauses. A spontaneous speech model incorporating prosody was developed, in which three kinds of speech segment models and the transition probabilities among them were specified. In the segmentation experiments, 87.6% of the segment boundaries were located correctly within 50 msec and 81.2% within 30 msec, a 10.1-point increase in performance compared with the initial model without prosodic information.
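
As an illustration of how a "located correctly within N msec" figure could be computed, the following is a minimal sketch, not the authors' scoring procedure. It assumes that boundaries are given as times in milliseconds and that a reference boundary counts as correct when any hypothesized boundary lies within the tolerance; the abstract does not state the matching rule, so this is an assumption. The function name `boundary_accuracy` and the sample values are hypothetical.

```python
from bisect import bisect_left


def boundary_accuracy(reference_ms, hypothesis_ms, tolerance_ms):
    """Return the fraction of reference boundaries that have at least one
    hypothesized boundary within tolerance_ms (assumed matching rule)."""
    if not reference_ms:
        raise ValueError("reference_ms must not be empty")
    hyp = sorted(hypothesis_ms)
    hits = 0
    for ref in reference_ms:
        # Only the hypothesized boundaries adjacent to the insertion point
        # can be the nearest ones, so checking them is sufficient.
        i = bisect_left(hyp, ref)
        neighbours = hyp[max(i - 1, 0): i + 1]
        if any(abs(h - ref) <= tolerance_ms for h in neighbours):
            hits += 1
    return hits / len(reference_ms)


# Made-up boundary times in msec, for illustration only.
reference = [100, 500, 900, 1400]
hypothesis = [120, 540, 910, 1480]

for tolerance in (30, 50):
    rate = boundary_accuracy(reference, hypothesis, tolerance)
    print(f"within {tolerance} msec: {rate:.1%}")
```

Note that this sketch allows one hypothesized boundary to match several reference boundaries; a stricter evaluation would enforce one-to-one matching.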

