The paper explores the use of a recurrent neural network to perform lexical access. The network is trained to map the phonemic output of a neural network front end (reported in ) to lexical items. A cascaded, multi-level approach allows common phonetic variation in words to be captured by the front end. To keep the context duration manageable, non-linear time compression is applied to the input of the lexical access network: transitional segments are retained while steady-state segments are greatly shortened, without completely obscuring duration information. A post-processor based on a simple finite state automaton decodes the raw network output, which takes the form of a word pseudo-probability vector synchronous with the input, into a symbolic (i.e. orthographic) interpretation of the input speech. Encouraging results on digit strings are presented.
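The non-linear time compression described above can be illustrated with a minimal sketch. The paper does not give the algorithm's details, so everything here is an assumption: frames are taken to be phoneme probability vectors, "steady state" is detected by cosine similarity between successive frames, each steady run is cut to a few frames, and a log-duration feature is appended so duration information is shortened but not lost.

```python
import numpy as np

def compress(frames, steady_thresh=0.9, keep=2):
    """Hypothetical non-linear time compression: keep transitional
    frames, shorten steady-state runs to `keep` frames, and append a
    log-duration feature so run length is not completely obscured.
    `frames` is an (n_frames, n_phonemes) array of probability vectors."""
    out = []
    i, n = 0, len(frames)
    while i < n:
        j = i + 1
        # extend the run while successive vectors stay similar (assumed criterion)
        while j < n and np.dot(frames[j], frames[i]) / (
                np.linalg.norm(frames[j]) * np.linalg.norm(frames[i])) > steady_thresh:
            j += 1
        run = frames[i:j]
        kept = run[:keep] if len(run) > keep else run
        dur = np.log1p(len(run))  # compressed duration cue for the run
        for f in kept:
            out.append(np.append(f, dur))
        i = j
    return np.array(out)

# A long steady run followed by a shorter one: 18 frames compress to 4,
# each carrying the original run length as an extra feature.
frames = np.vstack([np.tile([1.0, 0.0, 0.0], (10, 1)),
                    np.tile([0.0, 1.0, 0.0], (8, 1))])
compressed = compress(frames)
```

This keeps the input to the lexical access network short while transitions, which carry most of the discriminative information, survive untouched.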
Bibliographic reference. Russell, N. H. / Fallside, Frank / Robinson, A. J. / Prager, R. W. (1991): "Lexical access using a recurrent error propagation network", In EUROSPEECH-1991, 1023-1026.