Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
This paper describes the development of a full-form lexicon, combined with an algorithm for quasi-morphological decomposition, aimed at improving grapheme-to-phoneme conversion, word stress assignment, syllabification and word class assignment in a Text-to-Speech system. We explain how the optimal size of the lexicon was determined. We also describe a deterministic algorithm that decomposes words not found in the lexicon into a sequence of lexicon entries, prefixes, suffixes and infixes. The performance of the combined lexicon and decomposition system is evaluated on a newspaper corpus of approximately 100,000 words. The results indicate that the system handles more than 95% of the regular words in the test corpus correctly. The system still has to be extended with a module that handles proper names.
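To make the idea of decomposing an out-of-lexicon word into lexicon entries and affixes more concrete, the following is a minimal sketch only, not the algorithm of the paper, whose details are not given in this abstract. It assumes a longest-match-first, left-to-right search with backtracking, which is deterministic because candidates are always tried in the same order. The function name `decompose`, the toy English word lists, and the ordering rules (a prefix or infix must be followed by a stem; suffixes and infixes may only follow a stem) are all hypothetical choices made for illustration.

```python
# Hypothetical toy inventories; the paper's actual lexicon and affix lists
# are not reproduced in the abstract.
LEXICON = {"book", "shelf", "kind", "king", "man", "read"}
PREFIXES = {"un", "re"}
SUFFIXES = {"ing", "er", "ness"}
INFIXES = {"s"}  # linking element between two compound parts


def decompose(word, lexicon=LEXICON, prefixes=PREFIXES,
              suffixes=SUFFIXES, infixes=INFIXES):
    """Return a list of (piece, kind) tuples covering `word`, or None.

    Assumed ordering rules (illustrative only):
      - a prefix or an infix must be followed by further material ending in a stem
      - a suffix or an infix may only appear after a stem has been seen
    """

    def parse(pos, seen_stem, need_stem):
        if pos == len(word):
            # A valid parse contains a stem and does not end on a prefix/infix.
            return [] if seen_stem and not need_stem else None
        # Try the longest remaining piece first, always in the same order.
        for end in range(len(word), pos, -1):
            piece = word[pos:end]
            candidates = []
            if piece in lexicon:
                candidates.append(("stem", True, False))
            if piece in prefixes:
                candidates.append(("prefix", seen_stem, True))
            if seen_stem and not need_stem:
                if piece in suffixes:
                    candidates.append(("suffix", True, False))
                if piece in infixes:
                    candidates.append(("infix", True, True))
            for kind, next_seen, next_need in candidates:
                rest = parse(end, next_seen, next_need)
                if rest is not None:
                    return [(piece, kind)] + rest
        return None  # no decomposition found from this position

    return parse(0, False, False)


if __name__ == "__main__":
    for w in ["bookshelf", "unkindness", "kingsman", "xyz"]:
        print(w, decompose(w))
```

In a full system, each recognized piece would carry its own pronunciation, stress, syllable and word class information, so that the properties of the whole word could be derived from its parts. The sketch returns only the segmentation, and its worst-case running time grows quickly with word length because it does not cache failed positions.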
Bibliographic reference. Gulikers, Leon / Willemse, Rijk (1992): "A lexicon for a text-to-speech system", In ICSLP-1992, 101-104.