12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Uniform Speech Parameterization for Multi-Form Segment Synthesis

Alexander Sorin (1), Slava Shechtman (1), Vincent Pollet (2)

(1) IBM Research - Haifa, Israel
(2) Nuance Communications, Belgium

In multi-form segment synthesis, speech is constructed by sequencing speech segments of different nature: model segments, i.e. mathematical abstractions of speech, and template segments, i.e. speech waveform fragments. These multi-form segments can have shared, layered or alternate speech parameterization schemes. This paper introduces an advanced uniform speech parameterization scheme for the statistical model segments and waveform segments employed in our multi-form segment synthesis system. The Mel-Regularized Cepstrum, derived from amplitude and phase spectra, forms its basic framework. Furthermore, a new adaptive enhancement technique for model segments is presented that reduces the perceived gap in quality and similarity between model and template segments.

Bibliographic reference. Sorin, Alexander / Shechtman, Slava / Pollet, Vincent (2011): "Uniform speech parameterization for multi-form segment synthesis", In INTERSPEECH-2011, 337-340.