4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Word Class Driven Synthesis of Prosodic Annotations

Simon Arnfield

Department of Linguistic Science, The University of Reading, Whiteknights, Reading, UK

Prosody is an important aspect of speech that current text-to-speech synthesis systems fail to mimic in a convincing or natural way [1,2,3,4]. This paper describes research on a partial system for prosodic synthesis that uses easily derived low-level syntactic information. A computer program has been developed that annotates unseen text with prosodic stress and tone marks, using the sequence of part-of-speech tags previously assigned to each word by a tagging system. Training and testing material was taken from the Lancaster/IBM Spoken English Corpus (SEC). Co-occurrence measures were calculated relating stress and tone mark annotations to the word class annotations. A model built on these statistics calculates a score for every possible mapping between a given part-of-speech sequence and the potential stress/tone annotations. The highest-scoring pattern is selected as the most likely "baseline" annotation according to the model. Performance figures reach up to 91% agreement with the original corpus annotations.
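The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the scoring formula, so the smoothed log relative frequency used here is an assumption, as are the mark inventory `MARKS`, the tag names, and the function names `train`, `score`, `annotate`, and `agreement`.

```python
import itertools
import math
from collections import Counter

# Hypothetical inventory of prosodic marks; the SEC uses a richer set.
MARKS = ["none", "stress", "fall", "rise"]


def train(corpus):
    """Count tag/mark co-occurrences.

    corpus: iterable of sentences, each a list of (pos_tag, mark) pairs.
    """
    pair_counts = Counter()
    tag_counts = Counter()
    for sentence in corpus:
        for tag, mark in sentence:
            pair_counts[(tag, mark)] += 1
            tag_counts[tag] += 1
    return pair_counts, tag_counts


def score(tags, marks, pair_counts, tag_counts):
    """Score one candidate annotation of a tag sequence.

    Assumption: the score is the sum of add-one-smoothed log relative
    frequencies of each mark given its tag.
    """
    total = 0.0
    for tag, mark in zip(tags, marks):
        p = (pair_counts[(tag, mark)] + 1) / (tag_counts[tag] + len(MARKS))
        total += math.log(p)
    return total


def annotate(tags, pair_counts, tag_counts):
    """Enumerate every candidate mark sequence and keep the best-scoring one."""
    candidates = itertools.product(MARKS, repeat=len(tags))
    return max(
        candidates,
        key=lambda marks: score(tags, marks, pair_counts, tag_counts),
    )


def agreement(predicted, reference):
    """Fraction of positions where the predicted mark equals the reference."""
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Toy training data with made-up tags and marks.
toy_corpus = [
    [("DET", "none"), ("NOUN", "fall"), ("VERB", "stress")],
    [("DET", "none"), ("NOUN", "fall"), ("VERB", "stress")],
    [("DET", "none"), ("NOUN", "stress")],
]
pair_counts, tag_counts = train(toy_corpus)
best = annotate(["DET", "NOUN", "VERB"], pair_counts, tag_counts)
```

Exhaustive enumeration grows exponentially with sentence length, so this is practical only for short sequences. With the per-word score assumed here, the result coincides with choosing the best mark for each tag independently; a score that depends on neighbouring tags or marks would need a search such as dynamic programming instead.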


Bibliographic reference. Arnfield, Simon (1996): "Word class driven synthesis of prosodic annotations", in ICSLP-1996, 1978-1980.