First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

ATR Dialogue Database

Terumasa Ehara (1), Kentaro Ogura (2), Tsuyoshi Morimoto (1)

(1) ATR Interpreting Telephony Research Laboratories, Kyoto, Japan
(2) NTT Communications and Information Processing Laboratories, Kanagawa, Japan

Abstract

We are constructing a dialogue database called the ATR Dialogue Database (ADD) as basic data for studying an automatic interpreting telephony system. ADD is a large structured database of dialogues collected from simulated telephone or keyboard conversations that are spontaneously spoken or typed in Japanese or English. The corpus collected in one language is manually translated or interpreted into the other language, and correspondences between the two corpora are established at the level of several linguistic units. We compared telephone dialogues and keyboard dialogues in ADD. From the experimental results, we conclude that linguistic phenomena in telephone dialogues are almost the same as those in keyboard dialogues, with the following exceptions. The phenomena peculiar to telephone dialogues are interjections, restatements, fragmentary sentences, long complex sentences, redundant expressions, and indirect expressions.

Bibliographic reference.  Ehara, Terumasa / Ogura, Kentaro / Morimoto, Tsuyoshi (1990): "ATR dialogue database", In ICSLP-1990, 1093-1096.