Second European Conference on Speech Communication and Technology

Genova, Italy
September 24-26, 1991


Text-to-Speech Synthesizer Using Superposition of Sinusoidal Waves Generated by Synchronized Oscillators

Katsuhiko Shirai, K. Hashimoto, T. Kobayashi

Department of Electrical Engineering, Waseda University, Tokyo, Japan

A new speech synthesis method that uses mutually synchronized oscillators is proposed, and its applicability to text-to-speech systems is discussed. Voiced speech has a line-spectrum structure and can be represented as a superposition of sinusoidal waves. In our system, these sinusoidal waves are generated by a group of mutually synchronized oscillators, which are realized by numerically solving non-linear differential equations. The method has the following characteristics. (1) Voiced and voiceless sounds can be generated within the same framework by operating sinusoidal oscillators in parallel. (2) Because the phase and power of each sinusoidal wave can be easily controlled, periodic waveforms in voiced sounds can, if necessary, be precisely reproduced in the time domain. (3) The pitch frequency and phoneme duration can be easily changed without degrading the original sound quality.
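The superposition idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the non-linear differential equations behind the synchronized oscillators, so this sketch simply assumes ideal, perfectly locked oscillators and evaluates closed-form sines at integer multiples of the pitch frequency. The function name `synthesize_voiced` and all parameter names are hypothetical.

```python
import math


def synthesize_voiced(f0, amplitudes, phases, duration, sample_rate=16000):
    """Sum harmonics k*f0 (k = 1, 2, ...) with per-harmonic amplitude and phase.

    Assumption: ideal phase-locked sinusoids stand in for the paper's
    numerically integrated oscillators, whose equations are not in the abstract.
    """
    if len(amplitudes) != len(phases):
        raise ValueError("amplitudes and phases must have the same length")
    n_samples = int(round(duration * sample_rate))
    wave = []
    for i in range(n_samples):
        t = i / sample_rate
        sample = 0.0
        # Each harmonic is one "oscillator"; all run in parallel and are summed.
        for k, (amp, phase) in enumerate(zip(amplitudes, phases), start=1):
            sample += amp * math.sin(2.0 * math.pi * k * f0 * t + phase)
        wave.append(sample)
    return wave


# Same amplitudes and phases, two different pitch frequencies.
low = synthesize_voiced(100.0, [1.0, 0.5, 0.25], [0.0, 0.3, 0.6], 0.02, 8000)
high = synthesize_voiced(200.0, [1.0, 0.5, 0.25], [0.0, 0.3, 0.6], 0.02, 8000)
```

In this sketch the pitch is changed by passing a different `f0`, and the duration by passing a different `duration`, while the per-harmonic amplitudes and phases stay under direct control, which mirrors characteristics (2) and (3) in a simplified way.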


Bibliographic reference. Shirai, Katsuhiko / Hashimoto, K. / Kobayashi, T. (1991): "Text-to-speech synthesizer using superposition of sinusoidal waves generated by synchronized oscillators", in EUROSPEECH-1991, 39-42.