14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Formant Frequency Tracking Using Gaussian Mixtures with Maximum a Posteriori Adaptation

Jonathan C. Kim, Hrishikesh Rao, Mark A. Clements

Georgia Institute of Technology, USA

We present a novel method for estimating formant frequencies by fitting Gaussian mixtures to discrete Fourier transform (DFT) magnitude spectra. The method first estimates the Gaussian parameters for a sequence of wideband spectra using the Expectation-Maximization (EM) algorithm. It then refines the parameters using maximum a posteriori (MAP) adaptation. We evaluated the method on 516 utterances with manually labeled ground-truth data, comparing its results with PRAAT's formant tracking algorithm in various noisy environments and with one other state-of-the-art method. We obtained statistically significant improvements in relative error for the first three formants over all phonetic classes.
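The two stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the magnitude spectrum is normalized and treated as a density over frequency, fits a one-dimensional Gaussian mixture to it with EM, and then applies a generic relevance-factor form of MAP mean adaptation. The function names `fit_spectral_gmm` and `map_adapt_means`, the variance floor, the `relevance` value, the prior means, and the synthetic three-peak spectrum are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_spectral_gmm(freqs, mag, init_means, init_var=200.0**2,
                     var_floor=50.0**2, n_iter=50):
    """EM for a 1-D GMM in which each frequency bin is weighted by its magnitude.

    Assumption: the spectrum is normalized and treated as a density over frequency.
    Returns mixture weights, means (Hz), variances (Hz^2), and soft counts.
    """
    p = mag / mag.sum()                       # spectral mass per bin, sums to 1
    means = np.asarray(init_means, dtype=float).copy()
    n_comp = means.size
    variances = np.full(n_comp, float(init_var))
    weights = np.full(n_comp, 1.0 / n_comp)
    counts = np.full(n_comp, 1.0 / n_comp)

    for _ in range(n_iter):
        # E-step: responsibility of each component for each frequency bin
        diff = freqs[None, :] - means[:, None]
        log_pdf = -0.5 * (diff**2 / variances[:, None]
                          + np.log(2.0 * np.pi * variances[:, None]))
        log_r = np.log(weights[:, None]) + log_pdf
        log_r -= log_r.max(axis=0, keepdims=True)   # numerical stability
        resp = np.exp(log_r)
        resp /= resp.sum(axis=0, keepdims=True)

        # M-step: sufficient statistics weighted by spectral mass
        counts = np.maximum((resp * p).sum(axis=1), 1e-12)
        means = (resp * p * freqs).sum(axis=1) / counts
        diff = freqs[None, :] - means[:, None]
        variances = np.maximum((resp * p * diff**2).sum(axis=1) / counts,
                               var_floor)
        weights = counts / counts.sum()

    return weights, means, variances, counts


def map_adapt_means(prior_means, ml_means, counts, relevance=0.1):
    """Relevance-factor MAP interpolation between prior and data-driven means.

    Assumption: `relevance` is expressed in the same units as `counts`
    (fraction of spectral mass); larger values pull harder toward the prior.
    """
    alpha = counts / (counts + relevance)
    return alpha * ml_means + (1.0 - alpha) * prior_means


# Synthetic spectrum with three peaks standing in for F1-F3 (illustrative only).
freqs = np.arange(0.0, 4000.0, 10.0)
true_formants = np.array([500.0, 1500.0, 2500.0])
mag = np.exp(-0.5 * ((freqs[None, :] - true_formants[:, None]) / 100.0) ** 2).sum(axis=0)

weights, means, variances, counts = fit_spectral_gmm(
    freqs, mag, init_means=[600.0, 1400.0, 2600.0])

# Hypothetical prior means, e.g. taken from a fit over neighboring frames.
prior_means = np.array([550.0, 1450.0, 2550.0])
adapted = map_adapt_means(prior_means, means, counts)

print("EM means (Hz):", np.round(means, 1))
print("MAP-adapted means (Hz):", np.round(adapted, 1))
```

In this sketch the component means play the role of formant frequency estimates. The adaptation step shrinks each per-spectrum estimate toward a prior in proportion to how little spectral mass supports it, which is one plausible way to stabilize estimates in noise.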

Bibliographic reference. Kim, Jonathan C. / Rao, Hrishikesh / Clements, Mark A. (2013): "Formant frequency tracking using Gaussian mixtures with maximum a posteriori adaptation", in INTERSPEECH-2013, 3221-3225.