12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Speaker State Classification Based on Fusion of Asymmetric SIMPLS and Support Vector Machines

Dong-Yan Huang (1), Shuzhi Sam Ge (2), Zhengchen Zhang (2)

(1) A*STAR, Singapore
(2) National University of Singapore, Singapore

This paper describes a Speaker State Classification System (SSCS) for the INTERSPEECH 2011 Speaker State Challenge. Our SSC system for the Intoxication and Sleepiness Sub-Challenges fuses several individual sub-systems. We use the three standard feature sets per corpus provided by the organizers. Modeling is based on our own classification method, asymmetric simple partial least squares (ASIMPLS), and on Support Vector Machines (SVMs), followed by calibration and multiple fusion methods. The advantage of asymmetric SIMPLS is that it tends to protect the minority class from being misclassified while boosting performance on the majority class. Our experimental results show that our SSC system performs better than the baseline system. On the development set, our final fusion yields a 1.8% absolute improvement in unweighted accuracy over the baseline for the Alcohol Language Corpus (ALC) and about 0.7% for the Sleepy Language Corpus (SLC). On the test set, we obtain 1.1% and 1.4% absolute improvement, respectively.
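The results above are reported as unweighted accuracy, which is commonly computed as the mean of the per-class recalls, so that a small class counts as much as a large one. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name, class labels, and toy predictions are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally, whatever its size."""
    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly classified samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy data: 8 majority-class and 2 minority-class samples.
y_true = ["majority"] * 8 + ["minority"] * 2
# One of the two minority samples is misclassified as majority.
y_pred = ["majority"] * 8 + ["majority", "minority"]

ua = unweighted_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"unweighted accuracy: {ua:.2f}, plain accuracy: {plain:.2f}")
```

In this toy case a single minority-class error lowers the unweighted accuracy far more than the plain accuracy, which illustrates why protecting the minority class matters under this metric.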

Bibliographic reference.  Huang, Dong-Yan / Ge, Shuzhi Sam / Zhang, Zhengchen (2011): "Speaker state classification based on fusion of asymmetric SIMPLS and support vector machines", In INTERSPEECH-2011, 3301-3304.