13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Improved Formant Frequency Estimation From High-pitched Vowels by Downgrading the Contribution of the Glottal Source With Weighted Linear Prediction

Paavo Alku (1), Jouni Pohjalainen (1), Martti Vainio (2), Anne-Maria Laukkanen (3), Brad Story (4)

(1) Department of Signal Processing and Acoustics, Aalto University, Finland
(2) Department of Speech Sciences, University of Helsinki, Finland
(3) Department of Speech Communication and Voice Research, University of Tampere, Finland
(4) Speech Acoustics Laboratory, University of Arizona, USA

Because the performance of conventional linear prediction (LP) deteriorates in formant estimation of high-pitched voices, several all-pole modeling methods that are robust to the fundamental frequency (F0) have been developed. This study compares five such previously known methods and proposes a new technique, Weighted Linear Prediction with Attenuated Main Excitation (WLP-AME). WLP-AME uses weighted linear prediction in which the squared prediction error is multiplied by a weighting function that downgrades the contribution of the glottal source in the model optimization. Consequently, the resulting all-pole model is shaped more by the vocal tract characteristics, which leads to more accurate formant estimates. Using synthetic vowels created with a physical modeling approach, the study shows that WLP-AME yields more accurate formant frequency estimates for high-pitched vowels than the previously known methods.
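
The weighting idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the shape of the AME weighting function or how the main excitation is located, so the rectangular attenuation window, the known excitation instants, and the names `weighted_lp`, `ame_like_weight`, and `formant_frequencies` are all assumptions made for illustration. The toy signal is a single resonance at 800 Hz driven by an impulse train, not a physically modeled vowel.

```python
import numpy as np

def weighted_lp(s, w, p):
    """Predictor coefficients a[0..p-1] minimizing sum_n w[n] * e[n]**2,
    with e[n] = s[n] - sum_k a[k-1] * s[n-k] for k = 1..p."""
    s = np.asarray(s, dtype=float)
    w = np.asarray(w, dtype=float)
    n = np.arange(p, len(s))
    # Column k-1 holds the delayed samples s[n-k].
    D = np.column_stack([s[n - k] for k in range(1, p + 1)])
    # Weighted least squares: scale each row by sqrt(w[n]).
    sw = np.sqrt(w[n])
    a, *_ = np.linalg.lstsq(D * sw[:, None], s[n] * sw, rcond=None)
    return a

def ame_like_weight(length, excitation_instants, width, floor=0.01):
    """Hypothetical weight (assumption): 1 everywhere, `floor` inside a
    window of `width` samples centered on each excitation instant."""
    w = np.ones(length)
    for g in excitation_instants:
        lo = max(0, g - width // 2)
        hi = min(length, g + width // 2 + 1)
        w[lo:hi] = floor
    return w

def formant_frequencies(a, fs):
    """Formant candidates (Hz) from the pole angles of 1 / A(z)."""
    roots = np.roots(np.concatenate(([1.0], -np.asarray(a))))
    roots = roots[np.imag(roots) > 0]
    return np.sort(np.angle(roots) * fs / (2 * np.pi))

# Toy example (all values assumed): one resonance at 800 Hz, F0 = 400 Hz.
fs, f0, N = 8000, 400, 200
r, theta = 0.95, 2 * np.pi * 800 / fs
a_true = np.array([2 * r * np.cos(theta), -r ** 2])

instants = np.arange(0, N, fs // f0)   # one impulse per pitch period
excitation = np.zeros(N)
excitation[instants] = 1.0

s = np.zeros(N)
for n in range(N):
    s[n] = excitation[n]
    for k in (1, 2):
        if n - k >= 0:
            s[n] += a_true[k - 1] * s[n - k]

a_lp = weighted_lp(s, np.ones(N), p=2)                        # conventional LP
w = ame_like_weight(N, instants, width=3, floor=0.0)
a_wlp = weighted_lp(s, w, p=2)                                # excitation down-weighted

print("LP formant (Hz): ", formant_frequencies(a_lp, fs))
print("WLP formant (Hz):", formant_frequencies(a_wlp, fs))
```

In this idealized case the prediction error is nonzero only at the impulse instants, so removing those samples from the cost leaves equations that the true resonance satisfies exactly, while the unweighted fit is pulled away by the excitation. Real speech has a spread-out glottal excitation, so the gain there depends on how well the weighting tracks it.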

Index Terms: formants, linear prediction


Bibliographic reference. Alku, Paavo / Pohjalainen, Jouni / Vainio, Martti / Laukkanen, Anne-Maria / Story, Brad (2012): "Improved formant frequency estimation from high-pitched vowels by downgrading the contribution of the glottal source with weighted linear prediction", in INTERSPEECH-2012, 1612-1615.