INTERSPEECH 2006 - ICSLP
This paper focuses on acoustic modeling in speech recognition. A novel approach how to build grapheme based acoustic models with conversion from existing phoneme based acoustic models is proposed. The grapheme based acoustic models are created as weighted sum from monophone acoustic models. The influence of particular monophone is determined with the phoneme to grapheme confusion matrix. Further, the context-dependent acoustic models are being trained within the grapheme training procedure. The decision tree based clustering approach is used to tie similar states. A modified data-driven method for generation of grapheme broad classes needed during the initialization of decision tree is being applied. The data-driven broad classes are created using the grapheme based confusion matrix. All experiments were performed with the Slovenian language (1000 FDB SpeechDat(II) database), which is a highly inflectional language with no fixed set of rules for grapheme to phoneme conversion. The achieved results showed improvements of speech recognition results with the proposed methods.
Bibliographic reference. Zgank, Andrej / Kacic, Zdravko (2006): "Conversion from phoneme based to grapheme based acoustic models for speech recognition", In INTERSPEECH-2006, paper 1500-Wed1BuP.10.