4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
This paper analyzes the impact of German compound words on speech recognition. It is well known that, owing to an idiosyncrasy of German orthography, compound words make up a major fraction of the German vocabulary. Most out-of-vocabulary (OOV) compounds, however, are composed of frequent words that are already in the lexicon. The paper introduces a new method that handles the components of compounds rather than the compounds themselves. This not only reduces the vocabulary size, and therefore the perplexity, but also improves word accuracy. Reduced perplexity, in turn, means a more robust language model.
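The idea of covering OOV compounds with components already in the lexicon can be illustrated with a minimal sketch. This is not the authors' procedure, which the abstract does not detail. The function names (`split_compound`, `decompose_tokens`, `oov_count`), the toy lexicon, and the minimum component length of three letters are all assumptions made for illustration. Linking elements such as the German "Fugen-s" are deliberately ignored.

```python
from functools import lru_cache

# Toy lexicon of frequent words (hypothetical; a real system would use
# the recognizer's own vocabulary).
LEXICON = frozenset({"sprache", "sprach", "erkennung", "system", "wort", "schatz"})


def split_compound(word, lexicon, min_len=3):
    """Split `word` into the fewest lexicon entries, or return None.

    Assumption: components are simply concatenated, with no linking
    elements, and each component has at least `min_len` letters.
    """
    word = word.lower()

    @lru_cache(maxsize=None)
    def best(start):
        # Best segmentation of word[start:], as a tuple of components.
        if start == len(word):
            return ()
        candidates = []
        for end in range(start + min_len, len(word) + 1):
            part = word[start:end]
            if part in lexicon:
                rest = best(end)
                if rest is not None:
                    candidates.append((part,) + rest)
        return min(candidates, key=len) if candidates else None

    result = best(0)
    return list(result) if result is not None else None


def decompose_tokens(tokens, lexicon):
    """Replace each OOV token by its components when a split exists."""
    out = []
    for token in tokens:
        lowered = token.lower()
        parts = None if lowered in lexicon else split_compound(lowered, lexicon)
        out.extend(parts if parts else [lowered])
    return out


def oov_count(tokens, lexicon):
    """Count tokens that are not lexicon entries."""
    return sum(1 for token in tokens if token.lower() not in lexicon)


tokens = ["Spracherkennung", "System", "Wortschatz"]
print(oov_count(tokens, LEXICON))                              # compounds are OOV
print(decompose_tokens(tokens, LEXICON))                       # components only
print(oov_count(decompose_tokens(tokens, LEXICON), LEXICON))   # after splitting
```

In this toy setting the two compounds are OOV as whole words but are fully covered once they are replaced by their components, which is the effect on vocabulary coverage that the abstract describes. A practical splitter would also need to handle linking elements and ambiguous segmentations.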
Bibliographic reference. Berton, André / Fetter, Pablo / Regel-Brietzmann, Peter (1996): "Compound words in large-vocabulary German speech recognition systems", In ICSLP-1996, 1165-1168.