Second International Conference on Spoken Language Processing (ICSLP'92)

Banff, Alberta, Canada
October 13-16, 1992

Automatic Derivation of Lexical Models for a Very Large Vocabulary Speech Recognition System

R. Roddeman, H. Drexler, Louis Boves

Nijmegen University, Dept. of Language & Speech, Nijmegen, The Netherlands

It is well known [2] that the lexicon of any large-vocabulary dictation system has to be tuned to the end user's application. This would be no great problem if lexical entries in standard orthographic or phonemic representation were adequate. However, recognition performance improves considerably when the lexical models are specified in accordance with the peculiarities of the Acoustic Front-End (AFE) of the recognizer. This makes updating the lexicon a tedious, error-prone process that requires expertise far beyond what can be expected from the average end user. Similarly, during the initial development of a large-vocabulary dictation system, recognition performance can be much improved if the lexical models are well adapted to the AFE. Here too, creating the lexicon can become almost prohibitive if it must be done entirely by hand. For these reasons we have developed a completely automatic procedure for adapting the lexicon to the AFE. We have used this tool in the conversion of an existing Isolated Word Speech Recognition system from Italian [1] to Dutch. Use of the tool proved to speed up the development of the systems enormously. The work reported here was done in the framework of the ESPRIT Project POLYGLOT.

Keywords: Isolated-Word-Recognition, preselection, lexicon

