4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
This paper reports the development of a parser for a Norwegian text-to-speech system. The parser uses the Generalized Left Right (GLR) algorithm, a generalization of the well-known LR algorithm for parsing computer languages. The paper briefly describes the GLR algorithm, the integration of a probabilistic scoring model, our implementation of the parser in C++, attribute structures, the lexical interface, and the application of the parser to part-of-speech (POS) tagging for Norwegian. Applied to a small test set of about 4 000 words, this method correctly tags 96 % of the known words, which is close to the performance of other POS taggers trained on large text databases. It correctly tags 85 % of the unknown words, and the probability of choosing the wrong pronunciation of a word from the lexicon is less than 0.1 %.
Bibliographic reference. Heggtveit, Per Olav (1996): "A generalized LR parser for text-to-speech synthesis", In ICSLP-1996, 1429-1432.