EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Support Vector Machine with Dynamic Time-Alignment Kernel for Speech Recognition

Hiroshi Shimodaira (1), Ken-ichi Noma (1), Mitsuru Nakai (1), Shigeki Sagayama (2)

(1) Japan Advanced Institute of Science and Technology, Japan
(2) University of Tokyo, Japan

A new class of support vector machine (SVM) applicable to sequential-pattern recognition is developed by incorporating the idea of non-linear time alignment into the kernel function. Because the time-alignment operation on sequential patterns is embedded in the kernel evaluation, the same training and classification algorithms as in the original SVM can be employed without modification. Furthermore, frame-wise evaluation of the kernel in the proposed SVM (DTAK-SVM) enables frame-synchronous recognition of sequential patterns, which is suitable for continuous speech recognition. Preliminary experiments on speaker-dependent recognition of six voiced consonants demonstrated a correct classification rate of more than 98%, compared with 93% for hidden Markov models (HMMs).
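
To make the idea of embedding time alignment in the kernel concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the recursion, so the details here are assumptions: a Gaussian (RBF) frame-level kernel, symmetric DTW-style step weights (1 for horizontal and vertical steps, 2 for diagonal steps), and normalisation by the sum of the two sequence lengths. The names `frame_kernel`, `dtak` and `gamma` are hypothetical.

```python
import numpy as np

def frame_kernel(x, y, gamma=1.0):
    """Frame-level RBF kernel between two feature vectors (assumed choice)."""
    d = x - y
    return float(np.exp(-gamma * np.dot(d, d)))

def dtak(X, Y, gamma=1.0):
    """Time-alignment kernel sketch between sequences X (n x d) and Y (m x d).

    Dynamic programming picks the alignment path that maximises the
    weighted sum of frame-level kernel values. The step weights and the
    normalisation are assumptions, not taken from the abstract.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, m = len(X), len(Y)
    # G[i, j] = best accumulated score aligning X[:i] with Y[:j]
    G = np.full((n + 1, m + 1), -np.inf)
    G[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            k = frame_kernel(X[i - 1], Y[j - 1], gamma)
            G[i, j] = max(
                G[i - 1, j] + k,          # vertical step, weight 1
                G[i - 1, j - 1] + 2 * k,  # diagonal step, weight 2
                G[i, j - 1] + k,          # horizontal step, weight 1
            )
    # Every path has total weight n + m, so this keeps the value in (0, 1]
    return G[n, m] / (n + m)

# A sequence and a time-stretched copy of it (first frame repeated)
X = np.array([[0.0], [1.0], [2.0]])
Y = np.array([[0.0], [0.0], [1.0], [2.0]])
Z = np.array([[5.0], [6.0], [7.0]])

print(dtak(X, X))  # identical sequences
print(dtak(X, Y))  # same content, different length
print(dtak(X, Z))  # different content
```

Because the alignment happens inside `dtak`, a matrix of pairwise values between variable-length sequences could be handed to any SVM trainer that accepts a precomputed kernel, which is the sense in which the training and classification algorithms stay unchanged.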

Bibliographic reference.  Shimodaira, Hiroshi / Noma, Ken-ichi / Nakai, Mitsuru / Sagayama, Shigeki (2001): "Support vector machine with dynamic time-alignment kernel for speech recognition", In EUROSPEECH-2001, 1841-1844.