DiSS-LPSS Joint Workshop 2010

The 5th Workshop on Disfluency in Spontaneous Speech
The 2nd International Symposium on Linguistic Patterns in Spontaneous Speech

Tokyo, Japan, September 25-26, 2010

An Annotation Scheme for Syntactic Unit in Japanese Dialog

Takehiko Maruyama (1), Katsuya Takanashi (2), Nao Yoshida (1)

(1) National Institute for Japanese Language and Linguistics, Japan
(2) Academic Center for Computing and Media Studies, Kyoto University, Japan

In this paper, we propose a scheme for annotating syntactic units called DCUs (Dialog Clause-Units) in Japanese dialogs. Since there are no explicit devices to mark sentence boundaries in speech, precise definitions and criteria must be designed to extract syntactic units from utterances. We present a design of the DCU, which consists of clausal and non-clausal units. Having annotated DCU tags on eight dialogs of 40 minutes from two different dialog corpora, we examine the characteristics of each dialog from the viewpoint of the DCU and compare them with the distribution of clausal units annotated in monologs.

Index Terms. Dialog Clause-Unit, Japanese dialog and monolog, clause boundary, unit length


Bibliographic reference. Maruyama, Takehiko / Takanashi, Katsuya / Yoshida, Nao (2010): "An annotation scheme for syntactic unit in Japanese dialog", in DiSS-LPSS-2010, 51-54.