13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

On the Use of Non-Linear Polynomial Kernel SVMs in Language Recognition

Sibel Yaman, Jason Pelecanos, Mohamed Kamal Omar

IBM T. J. Watson Research Labs, Yorktown Heights, NY, USA

Low-dimensional representations have been shown to outperform their supervector counterparts in a variety of speaker recognition tasks. In this paper, we show that non-linear polynomial kernel support vector machines (SVMs) trained with low-dimensional representations almost halve the equal-error rate (EER) of the best-performing SVMs trained with supervectors. Non-linear kernel SVMs implicitly map the input features into higher-dimensional spaces, a mechanism known to be generally effective when the number of training instances is much larger than the feature dimension. Unlike linear kernels, non-linear kernels exploit the dependencies among different input feature dimensions in the resulting high-dimensional spaces. Our experiments demonstrate that fifth-order polynomial kernel SVMs trained with low-dimensional representations reduce the EER by 56% relative when compared to standard linear SVMs trained with supervectors, and by 40% relative when compared to the best-performing SVMs trained with supervectors.
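The implicit mapping mentioned in the abstract can be illustrated with a small sketch (the example values and helper names here are illustrative, not from the paper): a polynomial kernel k(x, y) = (x·y + c)^d computes, without ever forming it, the inner product in an expanded feature space whose coordinates include all monomials up to degree d, including the cross-terms that capture dependencies between input dimensions. Degree 2 is used below only because its explicit feature map is small enough to write out.

```python
import math

def poly_kernel(x, y, degree=2, c=1.0):
    """Polynomial kernel k(x, y) = (x . y + c)^degree, evaluated implicitly."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + c) ** degree

def explicit_phi(x):
    """Explicit feature map for the degree-2 kernel (x . y + 1)^2 on R^2:
    phi(x) = [1, sqrt(2)*x1, sqrt(2)*x2, x1^2, x2^2, sqrt(2)*x1*x2].
    Note the cross-term x1*x2: this is where inter-dimension
    dependencies enter the expanded space."""
    x1, x2 = x
    s = math.sqrt(2.0)
    return [1.0, s * x1, s * x2, x1 * x1, x2 * x2, s * x1 * x2]

x, y = [0.5, -1.2], [2.0, 0.3]
k_implicit = poly_kernel(x, y, degree=2, c=1.0)
k_explicit = sum(a * b for a, b in zip(explicit_phi(x), explicit_phi(y)))
assert abs(k_implicit - k_explicit) < 1e-9  # identical up to rounding
```

For d = 5, as in the paper's best system, the explicit map would be far too large to materialize, which is why the kernel evaluation is the practical route.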

Index Terms: language recognition, support vector machines


Bibliographic reference. Yaman, Sibel / Pelecanos, Jason / Omar, Mohamed Kamal (2012): "On the use of non-linear polynomial kernel SVMs in language recognition", in INTERSPEECH-2012, 2053-2056.