Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Robust parsing of N-best speech hypothesis lists using a general grammar-based language model

Manny Rayner (1), Peter Wyard (2)

(1) SRI International, Suite 23, Millers Yard, Cambridge, UK
(2) BT Laboratories, Martlesham Heath, Ipswich, UK

We describe a series of experiments designed to investigate the feasibility of using a general, linguistically motivated grammar of English to improve the language model of a speech recognizer. A largely automatic corpus-based method was used to convert the general grammar into a specialised version tuned to the domain. The specialised grammar was then used to parse N-best speech hypothesis lists produced by a recognizer, using an algorithm which optionally allowed deletions or substitutions at the beginnings and ends of utterances. Competing robust analyses were scored using a weighted combination of several corpus-based preference functions. The sentence accuracy of the recognizer improved from 34.5% to 39%, on a metric which regarded close variants of the reference sentence as successes.
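The scoring step described above, a weighted combination of preference functions applied to competing analyses, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Analysis` fields, the two preference functions, the weights, and the toy sentences are invented and are not taken from the paper, which derived its preference functions from corpus data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Analysis:
    """One robust analysis of one N-best hypothesis (hypothetical structure)."""
    words: Tuple[str, ...]
    acoustic_score: float  # assumed recognizer score, higher is better
    edge_edits: int        # assumed count of words deleted or substituted at utterance edges


PreferenceFn = Callable[[Analysis], float]


def combined_score(analysis: Analysis,
                   prefs: Dict[str, PreferenceFn],
                   weights: Dict[str, float]) -> float:
    """Weighted sum of all preference function values for one analysis."""
    return sum(weights[name] * fn(analysis) for name, fn in prefs.items())


def pick_best(analyses: List[Analysis],
              prefs: Dict[str, PreferenceFn],
              weights: Dict[str, float]) -> Analysis:
    """Return the analysis with the highest combined score."""
    return max(analyses, key=lambda a: combined_score(a, prefs, weights))


# Illustrative preference functions and weights (assumptions, not the paper's).
PREFS: Dict[str, PreferenceFn] = {
    "acoustic": lambda a: a.acoustic_score,
    "edits": lambda a: -float(a.edge_edits),  # penalise each edge edit
}
WEIGHTS: Dict[str, float] = {"acoustic": 1.0, "edits": 0.5}

# Toy competing analyses (invented data).
CANDIDATES: List[Analysis] = [
    Analysis(("show", "me", "flights", "to", "boston"), -10.0, 0),
    Analysis(("uh", "show", "me", "flights", "to", "austin"), -9.8, 1),
]

if __name__ == "__main__":
    best = pick_best(CANDIDATES, PREFS, WEIGHTS)
    print(" ".join(best.words))
```

In this toy setting the second candidate has the better acoustic score, but the penalty for its edge edit lowers its combined score, so the first candidate is selected. In practice the weights would be tuned on held-out data rather than set by hand.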

Bibliographic reference. Rayner, Manny / Wyard, Peter (1995): "Robust parsing of N-best speech hypothesis lists using a general grammar-based language model", in EUROSPEECH-1995, 1793-1796.