4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Formant Analysis Using Mixtures of Gaussians

Parham Zolfaghari, Tony Robinson

Cambridge University Engineering Department, Cambridge, UK

This paper describes a new formant analysis technique in which the formant parameters are represented as Gaussian mixture distributions. These are estimated from the Discrete Fourier Transform (DFT) magnitude spectrum of the speech signal. The parameters obtained are the means, variances and masses of the density functions, which are used to calculate the centre frequencies, bandwidths and amplitudes of formants within the spectrum. To fit the mixture distributions better, various modifications to the DFT magnitude spectrum, based on simple models of perception, were investigated. These include reduction of dynamic range, cepstral smoothing, use of the Mel scale and pre-emphasis of speech. Results are presented for these modifications, together with formant tracks obtained by analysing speech with the final formant analysis system.
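The core idea of the abstract, fitting a Gaussian mixture to a magnitude spectrum, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it treats the normalised spectrum as a histogram over frequency and runs a weighted EM procedure on it. The function name `fit_spectrum_gmm`, the initialisation, the synthetic three-peak spectrum, and the mappings from variance to bandwidth (full width at half maximum) and from mass to amplitude (component peak height) are all hypothetical choices made for this example; the abstract does not specify them.

```python
import numpy as np


def fit_spectrum_gmm(freqs, magnitude, n_components, n_iter=300):
    """Fit a 1-D Gaussian mixture to a magnitude spectrum by weighted EM.

    Each frequency bin is treated as a data point whose weight is its share
    of the total spectral magnitude (an assumption made for this sketch).
    """
    weights = magnitude / magnitude.sum()  # bin weights, summing to 1
    lo, hi = freqs[0], freqs[-1]
    span = hi - lo
    # Hypothetical initialisation: evenly spaced means, broad variances.
    means = lo + (np.arange(n_components) + 0.5) * span / n_components
    variances = np.full(n_components, (span / (2 * n_components)) ** 2)
    masses = np.full(n_components, 1.0 / n_components)

    for _ in range(n_iter):
        # E-step: responsibility of each component for each frequency bin.
        diff = freqs[None, :] - means[:, None]
        dens = (masses[:, None]
                * np.exp(-0.5 * diff ** 2 / variances[:, None])
                / np.sqrt(2.0 * np.pi * variances[:, None]))
        resp = dens / (dens.sum(axis=0, keepdims=True) + 1e-300)

        # M-step: re-estimate parameters, weighting bins by spectral magnitude.
        wk = resp * weights[None, :]
        masses = wk.sum(axis=1) + 1e-12  # guard against empty components
        means = (wk * freqs[None, :]).sum(axis=1) / masses
        diff = freqs[None, :] - means[:, None]
        variances = np.maximum((wk * diff ** 2).sum(axis=1) / masses, 1.0)

    order = np.argsort(means)
    return means[order], variances[order], masses[order]


def gaussian(f, mean, std):
    """Gaussian density, used only to build the synthetic test spectrum."""
    return np.exp(-0.5 * ((f - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


# Synthetic "spectrum" with three formant-like peaks (illustrative values).
freqs = np.linspace(0.0, 4000.0, 257)
true_means = np.array([500.0, 1500.0, 2500.0])
true_stds = np.array([80.0, 100.0, 120.0])
true_masses = np.array([0.5, 0.3, 0.2])
magnitude = sum(m * gaussian(freqs, mu, s)
                for m, mu, s in zip(true_masses, true_means, true_stds))

means, variances, masses = fit_spectrum_gmm(freqs, magnitude, n_components=3)

# Assumed mappings from mixture parameters to formant parameters.
centre_freqs = means
bandwidths = 2.0 * np.sqrt(2.0 * np.log(2.0) * variances)  # FWHM of each Gaussian
amplitudes = masses / np.sqrt(2.0 * np.pi * variances)     # component peak height

for cf, bw, amp in zip(centre_freqs, bandwidths, amplitudes):
    print(f"centre {cf:7.1f} Hz  bandwidth {bw:6.1f} Hz  amplitude {amp:.5f}")
```

On real speech the input would be the DFT magnitude of a windowed frame, possibly after the spectral modifications listed in the abstract, and the result would likely be sensitive to the initialisation and the number of components chosen.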


Bibliographic reference. Zolfaghari, Parham / Robinson, Tony (1996): "Formant analysis using mixtures of Gaussians", In ICSLP-1996, 1229-1232.