5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Emotional Speech Synthesis: From Speech Database to TTS

Juan Manuel Montero (1), Juana M. Gutierrez-Arriola (1), Sira Palazuelos (2), Emilia Enriquez (1), Santiago Aguilera (2), José Manuel Pardo (1)

(1) Grupo de Tecnologia del Habla;
(2) Laboratorio de Tecnologias de Rehabilitacion, Departamento de Ingenieria Electronica, E.T.S.I. Telecomunicación, Universidad Politecnica de Madrid, Spain

Modern speech synthesisers have achieved a high degree of intelligibility, but cannot yet be regarded as natural-sounding devices. To reduce the monotony of synthetic speech, the implementation of emotional effects is increasingly being considered. This paper presents a thorough study of emotional speech in Spanish and its application to TTS, together with a prototype system that simulates emotional speech using a commercial synthesiser. The design and recording of a Spanish database are described, as is the analysis of the emotional prosody (by fitting the data to a formal model). Using the collected data, a rule-based simulation of three primary emotions was implemented in the Text-to-Speech system. Finally, an assessment of the synthetic voice through perception experiments classifies the system as capable of producing a quality voice with recognisable emotional effects.


Bibliographic reference. Montero, Juan Manuel / Gutierrez-Arriola, Juana M. / Palazuelos, Sira / Enriquez, Emilia / Aguilera, Santiago / Pardo, José Manuel (1998): "Emotional speech synthesis: from speech database to TTS", in ICSLP-1998, paper 1037.