A text-to-speech synthesizer based on the concatenation of Polish diphones is described. Text-to-phoneme conversion is performed by a neural network. Diphones are extracted and stored pitch-synchronously, using a variable-rate linear predictive coder with mixed excitation. Pitch period modification is based on time-domain interpolation of the excitation signal. Duration is controlled by inserting pitch periods and interpolating the excitation signal. Preliminary results are reported.
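The pitch and duration operations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `resample_period` and `adjust_duration`, the use of linear interpolation, and the rule for choosing which periods to repeat are all assumptions made for illustration. Each pitch period is taken to be a list of excitation samples.

```python
from typing import List


def resample_period(excitation: List[float], new_length: int) -> List[float]:
    """Change the length of one pitch period of excitation samples.

    Assumption: plain linear interpolation in the time domain; the paper
    does not say which interpolation rule is used.
    """
    if not excitation or new_length < 1:
        raise ValueError("need a non-empty period and a positive length")
    n = len(excitation)
    if n == 1 or new_length == 1:
        return [excitation[0]] * new_length
    # Map output sample i onto a fractional position in the input period.
    step = (n - 1) / (new_length - 1)
    out = []
    for i in range(new_length):
        pos = i * step
        lo = min(int(pos), n - 2)  # left neighbour, clamped at the end
        frac = pos - lo
        out.append((1.0 - frac) * excitation[lo] + frac * excitation[lo + 1])
    return out


def adjust_duration(periods: List[List[float]], target_count: int) -> List[List[float]]:
    """Lengthen or shorten a segment by repeating or dropping whole periods.

    Assumption: periods are picked by evenly spaced index mapping, a
    hypothetical selection rule chosen only to keep the sketch short.
    """
    if not periods or target_count < 0:
        raise ValueError("need at least one period and a non-negative count")
    n = len(periods)
    return [periods[min(i * n // target_count, n - 1)] for i in range(target_count)]


if __name__ == "__main__":
    period = [0.0, 2.0, 4.0]
    # A longer period corresponds to a lower fundamental frequency.
    print(resample_period(period, 5))
    # Doubling the number of periods roughly doubles the segment duration.
    print(len(adjust_duration([period, period], 4)))
```

In this sketch, a longer resampled period lowers the pitch, while repeating whole periods lengthens the segment without changing the pitch. In a complete synthesizer the modified excitation would then drive the linear predictive synthesis filter.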
Bibliographic reference. Dymarski, Przemyslaw / Kuklinski, Slawomir / Kula, Slawomir (1995): "A text-to-speech synthesizer for the Polish language", In EUROSPEECH-1995, 1101-1104.