EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


F0 Feature Extraction by Polynomial Regression Function for Monosyllabic Thai Tone Recognition

Patavee Charnvivit, Somchai Jitapunkul, Visarut Ahkuputra, Ekkarit Maneenoi, Umavasee Thathong, Boonchai Thampanitchawong

Chulalongkorn University, Thailand

This paper presents a monosyllabic Thai tone recognition system. The system consists of three processes: extraction of the fundamental frequency (F0) from the input speech signal, analysis of the F0 contour for feature extraction, and classification of each tone using the extracted features. For F0 feature extraction, polynomial regression functions are fitted to the segmented F0 curve, and their coefficients are used as the feature vector. For tone recognition, a maximum a posteriori (MAP) probability classifier is used to classify each tone. The vocabulary set comprises short-vowel and long-vowel words and includes the effects of initial and final consonants on the shape of the F0 contour. The experimental results show that, when the system is used as a speaker-dependent system, the maximum recognition rate is 96.20% using a three-dimensional feature vector. The speaker-independent recognition rates are 79.99% for male speakers and 82.80% for female speakers using a four-dimensional feature vector.
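The feature extraction and classification steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the polynomial order, the time normalization, or the class-conditional density, so the sketch assumes a polynomial fitted over time normalized to [0, 1] (degree 2 yields three coefficients) and a Gaussian density per tone class. The names `f0_poly_features` and `GaussianMAP` are hypothetical.

```python
import numpy as np


def f0_poly_features(f0, degree=2):
    """Fit a polynomial to one segmented F0 contour and return its coefficients.

    Assumption: time is normalized to [0, 1] so that contours of different
    durations give comparable coefficients. Returns degree + 1 values,
    highest power first (the numpy.polyfit convention).
    """
    f0 = np.asarray(f0, dtype=float)
    t = np.linspace(0.0, 1.0, len(f0))
    return np.polyfit(t, f0, degree)


class GaussianMAP:
    """MAP classifier with one full-covariance Gaussian per class (an assumption)."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        self.params_ = []
        for c in self.classes_:
            Xc = X[y == c]
            mean = Xc.mean(axis=0)
            # A small ridge keeps the covariance invertible on small samples.
            cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])
            prior = len(Xc) / len(X)
            self.params_.append((mean, cov, prior))
        return self

    def predict(self, X):
        X = np.atleast_2d(np.asarray(X, dtype=float))
        scores = []
        for mean, cov, prior in self.params_:
            diff = X - mean
            inv = np.linalg.inv(cov)
            _, logdet = np.linalg.slogdet(cov)
            maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
            # Log posterior up to a constant shared by all classes:
            # log prior + Gaussian log likelihood.
            scores.append(np.log(prior) - 0.5 * (maha + logdet))
        best = np.argmax(np.stack(scores, axis=1), axis=1)
        return self.classes_[best]
```

In use, each training syllable's F0 contour would be passed through `f0_poly_features`, the resulting vectors stacked into a matrix with their tone labels, and `GaussianMAP().fit(X, y).predict(...)` applied to unseen contours. Raising `degree` to 3 gives a four-coefficient vector.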


Bibliographic reference. Charnvivit, Patavee / Jitapunkul, Somchai / Ahkuputra, Visarut / Maneenoi, Ekkarit / Thathong, Umavasee / Thampanitchawong, Boonchai (2001): "F0 feature extraction by polynomial regression function for monosyllabic Thai tone recognition", in EUROSPEECH-2001, 2753-2756.