Second International Conference on Spoken Language Processing (ICSLP'92)

Banff, Alberta, Canada
October 13-16, 1992

A Comparison of Statistical and Rule Based Methods of Determining Segmental Durations

Andrew P. Breen

BT Laboratories, Martlesham Heath, Ipswich, England

It has been shown [1] that the duration of phonetic segments is an important prosodic factor in the production of natural-sounding synthetic speech. It has also been shown [2] that the durations of phonetic segments are affected by a number of factors, such as stress, phrase boundaries, phonetic context and speaking rate. In attempting to predict the behaviour of segmental durations, many researchers have concentrated on one of two differing approaches: rule-based methods and statistical methods [3]. Rule-based methods attempt to model explicitly the factors known to affect segmental durations, while statistical methods rely solely on the brute-force tabulation of large amounts of annotated data. This paper compares the performance of a statistically based method of determining phonetic segment duration, currently being developed at BT Laboratories, with an implementation of the duration rules developed by D. Klatt [4][5]. The performance of each method is compared against hand-annotated durations obtained from a large body of test phrases spoken by a number of different speakers.

Bibliographic reference. Breen, Andrew P. (1992): "A comparison of statistical and rule based methods of determining segmental durations", In ICSLP-1992, 1199-1202.