5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Speech Synthesis and Prosody Modification Using Segmentation and Modelling of the Excitation Signal

Juana M. Gutierrez-Arriola, Francisco M. Gimenez de los Galanes, Mohammed H. Savoji, José M. Pardo

Grupo de Tecnologia del Habla, Departamento de Ingenieria Electronica, E.T.S.I. Telecomunicacion, Universidad Politecnica de Madrid, Spain

In previous work we presented a new method for improving the quality of LPC synthetic speech, in which the excitation signal is modelled with a polynomial function followed by an adaptive filter. This scheme has the properties of a mathematical model, which avoids the problems related to prosody control [1], [2]. To reduce storage requirements, a segmentation technique was developed that groups together several pitch periods based on spectral similarity. For every segment, the same coefficient set (for both the polynomial function and the post-processing filter) is used. These techniques were applied to a coding/decoding task, where the resulting speech quality was promising [1], [2]. In this paper we present results on prosodic modification, i.e. arbitrary changes of duration and fundamental frequency, which show the suitability of these methods for text-to-speech applications. We also present some results of extending the model to unvoiced segments of speech.
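The segmentation idea in the abstract (grouping several pitch periods by spectral similarity so that one coefficient set serves the whole segment) can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the grouping rule, or the threshold, so everything below is an assumption made for illustration: the Euclidean distance, the greedy comparison against the first period of each segment, and the names `spectral_distance` and `group_pitch_periods` are all hypothetical and are not the authors' method.

```python
from math import sqrt


def spectral_distance(a, b):
    """Euclidean distance between two spectral feature vectors.

    Assumption: the paper's actual similarity measure is not given in the
    abstract; Euclidean distance is used here only as a stand-in.
    """
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_pitch_periods(features, threshold):
    """Greedily group consecutive pitch periods into segments.

    `features` holds one spectral feature vector per pitch period.
    A new segment starts when a period differs from the FIRST period of
    the current segment by more than `threshold` (an assumed rule).
    Returns a list of (start, end) index pairs, with `end` exclusive.
    """
    if not features:
        return []
    segments = []
    start = 0
    for i in range(1, len(features)):
        if spectral_distance(features[start], features[i]) > threshold:
            segments.append((start, i))
            start = i
    segments.append((start, len(features)))
    return segments


# Toy data: three similar periods followed by two different ones.
periods = [[1.0, 0.5], [1.1, 0.5], [1.0, 0.6], [3.0, 2.0], [3.1, 2.1]]
segments = group_pitch_periods(periods, threshold=0.5)
print(segments)  # [(0, 3), (3, 5)]
```

In this sketch each returned segment would share a single set of model coefficients, so five pitch periods need only two coefficient sets, which is the storage reduction the abstract refers to.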


Bibliographic reference. Gutierrez-Arriola, Juana M. / Gimenez de los Galanes, Francisco M. / Savoji, Mohammed H. / Pardo, José M. (1997): "Speech synthesis and prosody modification using segmentation and modelling of the excitation signal", In EUROSPEECH-1997, 1059-1062.