Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
This paper reports on a study of the relationship between acoustic-prosodic variation and discourse structure, as determined from an independent model of discourse. We present results of two pilot studies. Our corpus consisted of three AP news stories recorded by a professional speaker. Following Grosz & Sidner (1986), subjects labeled discourse structure either from text alone or from text (with all orthographic markings except sentence-final punctuation removed) and speech; average inter-labeler agreement for structural elements ranged from 74.3% to 95.1%, depending on the feature. These elements of global structure, together with elements of local structure such as parentheticals and attributive tags, were correlated with variation in intonational and acoustic features such as pitch range, contour, timing, and amplitude. We found statistically significant associations between aspects of pitch range, amplitude, and timing and features of global and local structure, both for labelings from text alone and for labelings from speech. We further found that global and local structures can be reliably identified from acoustic and prosodic features, with cross-validated success rates of 86-97%.
Bibliographic reference. Grosz, Barbara / Hirschberg, Julia (1992): "Some intonational characteristics of discourse structure", In ICSLP-1992, 429-432.