EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Automatic Learning of Finite State Automata for Pronunciation Modeling

M. Pastor-i-Gadea, F. Casacuberta

Universitat Politècnica de València, Spain

The great variability of word pronunciations in spontaneous speech is one of the reasons for the low performance of current speech recognition systems. Generating dictionaries that take this variability into account can increase the robustness of such systems. A word pronunciation is a possible phone sequence that can appear in a real utterance, and it represents a possible acoustic realization of the word. Here, word pronunciations are modeled using finite state automata. The use of such models allows for the application of grammatical inference methods and for easy integration with other sources of knowledge. The training samples are obtained from the alignment between the phone decoding of each training utterance and the corresponding canonical transcription. The models proposed in this work were applied to a translation-oriented speech task. The improvements achieved by these models ranged from 0.6 to 2.7 points, depending on the language model used.
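
As an illustration of the two steps the abstract names, the sketch below aligns a decoded phone sequence with a canonical transcription and then builds a simple automaton from the observed pronunciations. It is a minimal sketch under stated assumptions, not the authors' procedure: the alignment is a plain unit-cost Levenshtein alignment, the automaton is a prefix tree acceptor (a common starting point for grammatical inference, used here as a stand-in), and the function names and the example phone strings are hypothetical.

```python
from collections import defaultdict


def align(canonical, decoded):
    """Unit-cost Levenshtein alignment of two phone sequences (assumed cost model).

    Returns a list of (canonical_phone, decoded_phone) pairs, where None
    marks a deleted canonical phone or an inserted decoded phone.
    """
    n, m = len(canonical), len(decoded)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (canonical[i - 1] != decoded[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                cost[i][j] == cost[i - 1][j - 1] + (canonical[i - 1] != decoded[j - 1])):
            pairs.append((canonical[i - 1], decoded[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((canonical[i - 1], None))  # canonical phone not realized
            i -= 1
        else:
            pairs.append((None, decoded[j - 1]))    # extra phone in the decoding
            j -= 1
    pairs.reverse()
    return pairs


def build_prefix_tree(samples):
    """Build a prefix tree acceptor from pronunciation samples.

    Returns (transitions, finals): transitions maps state -> {phone: next_state},
    finals maps each accepting state to the number of samples ending there.
    """
    transitions = {0: {}}
    finals = defaultdict(int)
    for sample in samples:
        state = 0
        for phone in sample:
            if phone not in transitions[state]:
                new_state = len(transitions)
                transitions[state][phone] = new_state
                transitions[new_state] = {}
            state = transitions[state][phone]
        finals[state] += 1
    return transitions, dict(finals)


def accepts(transitions, finals, phones):
    """Check whether the automaton accepts a phone sequence."""
    state = 0
    for phone in phones:
        if phone not in transitions[state]:
            return False
        state = transitions[state][phone]
    return state in finals


# Hypothetical example: a canonical form and a reduced realization of one word.
canonical = ["k", "a", "s", "a"]
decoded = ["k", "a", "s"]
pairs = align(canonical, decoded)
observed = [d for _, d in pairs if d is not None]
transitions, finals = build_prefix_tree([canonical, observed])
```

In this sketch the observed variant is simply the decoded side of the alignment. Splitting an utterance-level alignment into per-word samples, and generalizing the prefix tree by merging states, are left out because the abstract does not specify how either is done.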

Bibliographic reference.  Pastor-i-Gadea, M. / Casacuberta, F. (2001): "Automatic learning of finite state automata for pronunciation modeling", In EUROSPEECH-2001, 2297-2300.