4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
We present a continuous speech recognition architecture with a tightly coupled language model that addresses the degrading performance of the standard stack decoder as lexicon size increases. We cast recognition as two mutually recursive functions. The first uses an auxiliary retrieval function to obtain lexicalized (already built) solutions and merges them with the solutions built by the second function. The second describes the acoustic and semantic recognition process as a search problem, defined with the help of the first function and solved with the A* strategy. As a linguistic model, we use a hierarchy of linguistic levels, each of which has a particular meaning structure, a lexicon of lexicalized forms with their lexicalization probabilities, and a local lexical grammar describing how the semantic categories of that level can be built. The process can be further optimized if targets (constraints on the possible solutions) are supplied to guide and restrict recognition. Target guidance implies mechanisms for target focusing, for locally matching targets to the recognition state, and for target prediction with the help of a local lexical grammar. We are testing the architecture on a DARPA RM-like application.
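The interplay of the two mutually recursive functions can be sketched as follows. This is a minimal illustration, not the paper's implementation: the lexicon, its probabilities, and the string-segmentation task are hypothetical stand-ins for the acoustic-semantic search, and the best-first expansion uses a trivial zero heuristic in place of a full A* estimate. `solve` first consults the retrieval function for an already-built (lexicalized) solution and otherwise merges in the result built by `search`, which in turn recurses through `solve`.

```python
import heapq
from math import log

# Hypothetical lexicon of lexicalized forms with lexicalization
# probabilities (illustrative values only).
LEXICON = {"the": 0.4, "cat": 0.3, "th": 0.1, "ecat": 0.05}

def retrieve(rest, cache):
    """Auxiliary retrieval function: return an already-built
    (lexicalized) solution for this state, if one exists."""
    return cache.get(rest)

def solve(rest, cache):
    """First function: prefer a retrieved solution, otherwise merge in
    the solution built by the search, and memoize it."""
    hit = retrieve(rest, cache)
    if hit is not None:
        return hit
    built = search(rest, cache)
    cache[rest] = built
    return built

def search(rest, cache):
    """Second function: expand candidate words in best-first order
    (zero-heuristic A*) and recurse on the remainder via `solve`."""
    if rest == "":
        return (0.0, [])            # empty remainder costs nothing
    frontier = []                   # priority queue of (cost, word)
    for word, prob in LEXICON.items():
        if rest.startswith(word):
            heapq.heappush(frontier, (-log(prob), word))
    best = (float("inf"), None)     # None path marks a dead end
    while frontier:
        cost, word = heapq.heappop(frontier)
        sub_cost, sub_path = solve(rest[len(word):], cache)
        if sub_path is None:
            continue                # remainder cannot be segmented
        total = cost + sub_cost
        if total < best[0]:
            best = (total, [word] + sub_path)
    return best

cost, words = solve("thecat", {})
print(words)  # → ['the', 'cat']
```

Memoizing built solutions in the cache is what makes retrieval meaningful on repeated sub-states; targets, in the paper's sense, would further prune the candidate words pushed onto the frontier.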
Bibliographic reference. Valverde-Albacete, Francisco J. / Pardo, José M. (1996): "A multi-level lexical-semantics based language model design for guided integrated continuous speech recognition", In ICSLP-1996, 224-227.