5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Parsers, Prominence, and Pauses

Nick Campbell, Tony Hebert, Ezra Black

ATR Interpreting Telecommunications Research Laboratories, Kyoto, Japan

We present results of a comparison between two prosody prediction algorithms, showing that incorporating information from a parser significantly improves the performance of our text-to-speech synthesiser. We used a stochastic tree-based parser to generate a tagged and bracketed representation of the input text, and then interpreted this higher-level information to produce a ToBI-type prosodic annotation of the text. From this annotation an intonation contour was predicted for use in synthesising the speech. Results show that the predictions of prosodic phrasing and focal prominence are improved by 56% and 62% respectively over previous methods, when evaluated against a human reading of the same test utterances.
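As a rough illustration of the middle step (tagged and bracketed parse in, ToBI-type labels out), the following is a minimal sketch only. The bracket format, the tag prefixes in `CONTENT_TAGS`, and both labelling rules (accent every content word, place a phrase break after the subject noun phrase) are assumptions made for this example; they are not the rules used in the paper, whose parser output and interpretation step are not described in this abstract.

```python
import re

# Hypothetical: tag prefixes treated as content words (not from the paper).
CONTENT_TAGS = ("NN", "VB", "JJ", "RB")


def parse_brackets(text):
    """Parse a bracketed string into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    return node()


def annotate(tree):
    """Return [word, accent, boundary] triples using two toy rules."""
    out = []

    def walk(n, depth):
        label, children = n
        # A preterminal holds exactly one word, e.g. (NN parser).
        if len(children) == 1 and isinstance(children[0], str):
            accent = "H*" if label.startswith(CONTENT_TAGS) else ""
            out.append([children[0], accent, ""])
            return
        for child in children:
            walk(child, depth + 1)
        if depth == 0:
            out[-1][2] = "L-L%"  # full boundary at the end of the utterance
        elif depth == 1 and label == "NP":
            out[-1][2] = "L-"    # toy rule: minor break after a top-level NP

    walk(tree, 0)
    return out


parse = "(S (NP (DT the) (NN parser)) (VP (VBZ improves) (NP (JJ prosodic) (NN phrasing))))"
labels = annotate(parse_brackets(parse))
for word, accent, boundary in labels:
    print(f"{word:10s} {accent:3s} {boundary}")
```

In a full system, labels of this kind would then drive the prediction of the intonation contour; that stage is omitted here.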

Bibliographic reference. Campbell, Nick / Hebert, Tony / Black, Ezra (1997): "Parsers, prominence, and pauses", in EUROSPEECH-1997, 979-982.