4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

On Designing Pronunciation Lexicons for Large Vocabulary, Continuous Speech Recognition

Lori Lamel, Gilles Adda

Spoken Language Processing Group, LIMSI-CNRS, Orsay, France

Creation of pronunciation lexicons for speech recognition is widely acknowledged to be an important, but labor-intensive, aspect of system development. Lexicons are often manually created and make use of knowledge and expertise that is difficult to codify. In this paper we describe our American English lexicon developed primarily for the ARPA WSJ/NAB tasks. The lexicon is phonemically represented, and contains alternate pronunciations for about 10% of the words. Tools have been developed to add new lexical items, as well as to help ensure consistency of the pronunciations. Our experience in large vocabulary, continuous speech recognition is that systematic lexical design can improve system performance. Some comparative results with commonly available lexicons are given.
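As an illustration only, the kind of lexicon described above (phonemic entries, alternate pronunciations for some words, and tool-assisted consistency checking) might be sketched as follows. The phone symbols, the example entries, and the helper names `alternate_fraction` and `unknown_phones` are all hypothetical and are not taken from the paper or from the LIMSI lexicon.

```python
from typing import Dict, List, Set

# Hypothetical phone inventory (an assumption, not the paper's phone set).
PHONES: Set[str] = {"t", "ax", "m", "ey", "ow", "aa", "d", "ae"}

# A lexicon maps each word to one or more phonemic pronunciations.
Lexicon = Dict[str, List[List[str]]]

# Toy entries, invented for illustration.
lexicon: Lexicon = {
    "tomato": [["t", "ax", "m", "ey", "t", "ow"],
               ["t", "ax", "m", "aa", "t", "ow"]],
    "data": [["d", "ey", "t", "ax"],
             ["d", "ae", "t", "ax"]],
    "mat": [["m", "ae", "t"]],
    "mode": [["m", "ow", "d"]],
}


def alternate_fraction(lex: Lexicon) -> float:
    """Return the fraction of words that have more than one pronunciation."""
    if not lex:
        return 0.0
    with_alternates = sum(1 for prons in lex.values() if len(prons) > 1)
    return with_alternates / len(lex)


def unknown_phones(lex: Lexicon, inventory: Set[str]) -> Dict[str, List[str]]:
    """Map each word to the sorted list of its symbols missing from the inventory."""
    problems: Dict[str, List[str]] = {}
    for word, prons in lex.items():
        bad = sorted({ph for pron in prons for ph in pron if ph not in inventory})
        if bad:
            problems[word] = bad
    return problems
```

A check such as `unknown_phones(lexicon, PHONES)` returning an empty dictionary is one simple form of consistency test. The tools described in the paper may work quite differently.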

Bibliographic reference.  Lamel, Lori / Adda, Gilles (1996): "On designing pronunciation lexicons for large vocabulary, continuous speech recognition", In ICSLP-1996, 6-9.