Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Spontaneous Thai Speech Recognition

Monika Woszczyna (1,2), Paisarn Charoenpornsawat (2), Tanja Schultz (2)

(1) Multimodal Technologies Inc., USA; (2) Carnegie Mellon University, USA

This paper expands on previous work on Thai speech recognition by investigating pronunciation changes in spontaneous Thai speech, such as syllable and phoneme elisions and phoneme shifts. We compare several approaches to modeling these effects in large-vocabulary continuous speech recognition across multiple domains. This work includes experiments on two new speech databases that significantly alleviate the data sparseness problem that affected earlier publications. We found that, given sufficient training data, a fully data-driven approach using an allophone cluster tree yields the best results. Explicit modeling of pronunciation changes does not improve performance across domains.

Bibliographic reference. Woszczyna, Monika / Charoenpornsawat, Paisarn / Schultz, Tanja (2006): "Spontaneous Thai speech recognition", in INTERSPEECH-2006, paper 1419-Wed2CaP.8.