Second European Conference on Speech Communication and Technology

Genova, Italy
September 24-26, 1991


An Analysis of Strategies for Finding Prosodic Clues in Text

Michael H. O'Malley, Howard Resnick, Michelle Caisse

Berkeley Speech Technologies, Berkeley, California, USA

This paper evaluates the potential benefits of associating certain text phenomena with certain prosodic effects in a text-to-speech (TTS) system. Samples of text were collected from two common TTS applications: interactive electronic messages and expository news articles. The frequency of each phenomenon was measured for each of the two styles of text, and a judgement was made as to the appropriateness of the associated prosodic effect. In electronic mail, 7 of the 11 phenomena studied would have produced the appropriate prosody more often than about once per 1000 words of text. In news text, none of the phenomena occurred that often. Implementing the reliable prosody assignment rules would improve prosody an average of about once every 100 words for e-mail and once every 880 words for news text. A simple nuclear accent placement rule was also evaluated. It located the nuclear accent incorrectly in 22% of the intonational phrases. Most of these errors were due to compounds rather than to more complex discourse phenomena. Accurate detection of compounds and two-word verbs could improve prosody an average of once every 60 words for both styles of text.
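The abstract states its results in two different units: occurrences per 1000 words, and "once every N words". The following minimal Python sketch puts the quoted intervals on the per-1000-word scale so they can be compared directly. Only the numbers 100, 880, and 60 come from the abstract. The names `WORDS_PER_IMPROVEMENT`, `per_thousand`, and `rate_table` are illustrative assumptions and are not taken from the paper.

```python
# Average number of words between prosody improvements, as quoted in the
# abstract. The dictionary layout and key names are assumptions made for
# illustration only.
WORDS_PER_IMPROVEMENT = {
    ("reliable prosody rules", "e-mail"): 100,
    ("reliable prosody rules", "news"): 880,
    ("compound and two-word verb detection", "e-mail"): 60,
    ("compound and two-word verb detection", "news"): 60,
}


def per_thousand(words_between: float) -> float:
    """Convert 'once every N words' into events per 1000 words."""
    if words_between <= 0:
        raise ValueError("words_between must be positive")
    return 1000.0 / words_between


def rate_table(intervals):
    """Map each (strategy, style) key to a rate per 1000 words, rounded to 2 places."""
    return {key: round(per_thousand(n), 2) for key, n in intervals.items()}


if __name__ == "__main__":
    for (strategy, style), rate in rate_table(WORDS_PER_IMPROVEMENT).items():
        print(f"{strategy:38s} {style:7s} {rate:6.2f} per 1000 words")
```

On this scale the reliable rules correspond to roughly 10 improvements per 1000 words for e-mail and about 1.14 for news, while compound and two-word verb detection corresponds to about 16.67 per 1000 words in both styles.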
