EUROSPEECH '97

This paper describes a low-bitrate segmental formant vocoder. The formants are estimated using a mixture of Gaussians whose means are constrained to vary linearly with time within a segment. A new method of smoothing the power spectrum is used to improve modelling with mixtures of Gaussians. Pitch is estimated using the autocorrelation function, and voicing is detected using the autocorrelation function method and the energy in the spectrum. Optimal segment boundaries are obtained using a dynamic programming procedure based on the power-normalised log-likelihood of the segment. Magnitude-only sinusoidal synthesis is then used to synthesise speech from the estimated spectrum. Using multiple codebooks, an average bitrate of 500 bps has been obtained.
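As an illustration of the autocorrelation step mentioned in the abstract, the following is a minimal sketch of pitch estimation for a single frame. It is not the authors' implementation: the function name `estimate_pitch`, the 60-400 Hz search range, the 8 kHz sampling rate, and the use of the normalised autocorrelation peak as a rough voicing cue are all assumptions made for this example, and the spectral-energy part of the voicing decision is not modelled.

```python
import math


def estimate_pitch(frame, sample_rate, f_min=60.0, f_max=400.0):
    """Return (pitch_hz, peak_strength) for one frame of samples.

    The pitch lag is the lag with the largest raw autocorrelation inside
    the assumed search range [f_min, f_max]. peak_strength is that peak
    divided by the zero-lag energy, usable as a crude voicing cue.
    """
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(int(sample_rate / f_min), len(frame) - 1)
    energy = sum(x * x for x in frame)
    if energy == 0.0 or lag_min > lag_max:
        return 0.0, 0.0  # silent or too-short frame: no pitch estimate

    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Raw (biased) autocorrelation at this lag.
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value

    return sample_rate / best_lag, best_value / energy


# Synthetic check: a 100 Hz sinusoid sampled at an assumed 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 100.0 * n / sample_rate) for n in range(400)]
pitch, strength = estimate_pitch(frame, sample_rate)
```

On this synthetic frame the autocorrelation peak falls at a lag of 80 samples, so `pitch` comes out at 100 Hz. A frame would be treated as voiced when `strength` exceeds some threshold, but the threshold used in the paper is not given in the abstract.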
Bibliographic reference. Zolfaghari, Parham / Robinson, Tony (1997): "A segmental formant vocoder based on linearly varying mixture of Gaussians", In EUROSPEECH-1997, 425-428.