Sixth International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA 2009)
This paper presents a novel approach for estimating speaker-specific vocal tract properties. It is shown that, as an alternative to the well-known long-term average spectrum (LTAS) of speech, the variance of the spectral magnitude in each frequency band is also suitable for estimating formant frequencies. This representation, called mean spectral variance (MSV), is applied to an automatic gender classification task, where it achieves good classification accuracy in combination with the fundamental frequency of speech. The MSV is compared with LTAS, and their similarities and differences are discussed.
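As a rough illustration of the two representations named in the abstract, the sketch below computes a per-bin mean of short-time magnitude spectra (an LTAS-like curve) and a per-bin variance of the same spectra (an MSV-like curve). The abstract does not give the paper's exact band definitions, windowing, or normalization, so everything here is an assumption made for illustration: the function names, the Hann window, the frame length and hop size, and the choice to treat each FFT bin as one band.

```python
import numpy as np


def frame_magnitude_spectra(signal, frame_len=512, hop=256):
    """Split a 1-D signal into overlapping Hann-windowed frames and return
    their magnitude spectra, shape (n_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


def ltas(mags):
    """LTAS-like curve: mean magnitude per frequency bin across frames."""
    return mags.mean(axis=0)


def msv(mags):
    """MSV-like curve (assumed form): variance of the magnitude per
    frequency bin across frames."""
    return mags.var(axis=0)


if __name__ == "__main__":
    # Synthetic stand-in for speech: a 1 kHz tone with slow amplitude modulation.
    fs = 8000
    t = np.arange(2 * fs) / fs
    x = (1.0 + 0.5 * np.sin(2 * np.pi * 2 * t)) * np.sin(2 * np.pi * 1000 * t)

    mags = frame_magnitude_spectra(x)
    freqs = np.fft.rfftfreq(512, d=1.0 / fs)
    print("LTAS peak (Hz):", freqs[np.argmax(ltas(mags))])
    print("MSV peak (Hz):", freqs[np.argmax(msv(mags))])
```

On real speech, peaks in either curve would only be candidate formant regions; the actual estimation and classification procedure is described in the full paper, not in this sketch.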
Index Terms. Formant estimation, gender classification, long-term feature averaging
Full Paper (reprinted with permission from Firenze University Press)
Bibliographic reference. Laine, Unto K. / Räsänen, O. J. (2009): "Indirect estimation of formant frequencies through mean spectral variance with application to automatic gender recognition", In MAVEBA-2009, 111-114.