ISCA Archive Interspeech 2013

An efficient method to estimate pronunciation from multiple utterances

Tofigh Naghibi, Sarah Hoffmann, Beat Pfister

Given K utterances of a word and a set of sub-word units, one may need a generalization of the conventional one-dimensional Viterbi algorithm to decode the utterances jointly and derive their underlying word model (pronunciation). This extension is called the K-dimensional Viterbi algorithm. However, its complexity grows exponentially with the number of utterances, which makes the computational burden prohibitive. Here, we propose an approximation algorithm for the K-dimensional Viterbi algorithm that uses the available utterances efficiently to estimate the pronunciation. In addition to automatic dictionary generation, it can be used in computationally expensive applications such as lexicon-free training and joint pattern alignment.
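For orientation, the conventional one-dimensional Viterbi algorithm that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration for a single utterance over a toy discrete HMM, not the approximation algorithm proposed in the paper. The function name `viterbi`, the two units "a" and "b", the symbols "x" and "y", and all probabilities are assumptions invented for the example.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for one observation sequence."""
    # delta[s]: best log-probability of any path that ends in state s
    # at the current frame.
    delta = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    backptr = []  # one dict per frame after the first: state -> best predecessor
    for o in obs[1:]:
        prev = delta
        delta, ptr = {}, {}
        for s in states:
            best_prev = max(states, key=lambda r: prev[r] + log_trans[r][s])
            delta[s] = prev[best_prev] + log_trans[best_prev][s] + log_emit[s][o]
            ptr[s] = best_prev
        backptr.append(ptr)
    # Backtrack from the best final state to recover the state sequence.
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return delta[last], path

def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}

# Hypothetical toy model: two sub-word units and two observation symbols.
states = ["a", "b"]
log_start = {"a": math.log(0.9), "b": math.log(0.1)}
log_trans = to_log({"a": {"a": 0.5, "b": 0.5}, "b": {"a": 0.1, "b": 0.9}})
log_emit = to_log({"a": {"x": 0.9, "y": 0.1}, "b": {"x": 0.1, "y": 0.9}})

log_prob, path = viterbi(["x", "x", "y", "y"], states, log_start, log_trans, log_emit)
print(path)
```

In this sketch the trellis has one time axis, so the work is proportional to the number of frames times the square of the number of states. Decoding several utterances jointly replaces that single axis with one axis per utterance, which is the source of the exponential growth the abstract describes.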

doi: 10.21437/Interspeech.2013-465

Cite as: Naghibi, T., Hoffmann, S., Pfister, B. (2013) An efficient method to estimate pronunciation from multiple utterances. Proc. Interspeech 2013, 1951-1955, doi: 10.21437/Interspeech.2013-465

  author={Tofigh Naghibi and Sarah Hoffmann and Beat Pfister},
  title={{An efficient method to estimate pronunciation from multiple utterances}},
  booktitle={Proc. Interspeech 2013},