Second European Conference on Speech Communication and Technology

Genova, Italy
September 24-26, 1991


Implementing Duration Expert Rules into a Text-to-Speech Synthesis System

L. Mortamet

Centre National d'Etudes des Telecommunications (LAA/TSS/RCP), Lannion, France

We used a natural speech database available at CNET to validate a theoretical duration model defined by a linguistic expert. The model has a hierarchical tree structure in which each node corresponds to a particular syntactic-prosodic configuration and is assigned an absolute duration value. This value is calculated by averaging the natural durations of the database entries that match the configuration. The validation had two goals: the first was to evaluate the validity and predictive power of the rules (i.e., to ensure that they define a minimal and complete rule set); the second was to assess the rules perceptually (i.e., to evaluate the quality of the speech produced with them). A systematic perceptual test compared natural phoneme durations with rule-calculated durations. The preliminary results appear correct overall but indicate the need for some local changes, such as in the processing of semi-vowels. These changes will allow us to refine the initial model using symbolic learning techniques.

Keywords: speech synthesis, prosody, duration rules, natural speech database
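
As an illustration of the averaging scheme described in the abstract, the following is a minimal Python sketch of a tree in which each node stores the mean duration of the database entries matching its configuration. It is not the CNET implementation: the feature labels ("vowel", "phrase-final", and so on), the duration values, the names `Node`, `insert` and `predict`, and the fallback to the nearest known ancestor for unseen configurations are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Node:
    """One configuration in the tree; `label` is a hypothetical feature value."""
    label: str
    children: dict = field(default_factory=dict)
    durations: list = field(default_factory=list)  # observed durations (ms)

    @property
    def value(self):
        # Absolute duration assigned to the node: mean of matching entries.
        return mean(self.durations) if self.durations else None


def insert(root, path, duration_ms):
    """Record one database entry at every node along its configuration path."""
    node = root
    node.durations.append(duration_ms)
    for label in path:
        node = node.children.setdefault(label, Node(label))
        node.durations.append(duration_ms)


def predict(root, path):
    """Return the value of the deepest node matching the path.

    Falling back to the nearest known ancestor is an assumption of this
    sketch, not something stated in the abstract.
    """
    node = root
    for label in path:
        child = node.children.get(label)
        if child is None:
            break
        node = child
    return node.value


# Toy entries (invented values, for illustration only).
root = Node("all")
entries = [
    (("vowel", "phrase-final"), 140.0),
    (("vowel", "phrase-final"), 160.0),
    (("vowel", "phrase-medial"), 80.0),
    (("consonant", "phrase-medial"), 60.0),
]
for path, duration in entries:
    insert(root, path, duration)

print(predict(root, ("vowel", "phrase-final")))
```

Storing the durations at every node along the path means that an intermediate node, such as "vowel" alone, also carries an average over all entries below it, which is one simple way to obtain a value when a more specific configuration has no data.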

