A probabilistic approach to lexical access from a recognized phone lattice is presented. Lexical access is seen as finding the overall likelihood of a sequence of phones and durations for given words. The overall recognition strategy is to find the word sequence that maximizes this likelihood combined with priors obtained from a language model. The likelihood computed in lexical access is a combination of the acoustic likelihoods obtained from a phone recognizer and lexical likelihoods, which represent phone realization and duration likelihoods for given word sequences. Classification trees are used to estimate the phone realization distributions, and regression trees are used to estimate the phone duration distributions. We find that they can effectively capture allophonic variation, alternative pronunciations, word coarticulation, and segmental durations. We describe a simplified but efficient application of these models to lexical access in the DARPA Resource Management recognition task.
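The scoring idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate words, the numeric scores, the function names, and the assumption that per-phone log-likelihoods simply add are all hypothetical and chosen only to show how a language-model prior might be combined with acoustic, realization, and duration likelihoods.

```python
import math

def sequence_log_score(log_prior, phones):
    """Add the language-model log prior to the per-phone log-likelihoods.

    Assumption for illustration only: acoustic, realization, and duration
    terms are treated as independent, so their logs are summed.
    """
    total = log_prior
    for p in phones:
        total += p["acoustic"] + p["realization"] + p["duration"]
    return total

def best_word_sequence(candidates):
    """Return the candidate whose combined log score is highest."""
    return max(
        candidates,
        key=lambda c: sequence_log_score(c["log_prior"], c["phones"]),
    )

# Hypothetical toy hypotheses; none of these values come from the paper.
candidates = [
    {
        "words": ("show", "ships"),
        "log_prior": math.log(0.6),
        "phones": [
            {"acoustic": -1.0, "realization": -0.2, "duration": -0.5},
            {"acoustic": -1.5, "realization": -0.3, "duration": -0.4},
        ],
    },
    {
        "words": ("show", "chips"),
        "log_prior": math.log(0.4),
        "phones": [
            {"acoustic": -1.2, "realization": -0.9, "duration": -0.5},
            {"acoustic": -1.5, "realization": -0.3, "duration": -0.4},
        ],
    },
]

best = best_word_sequence(candidates)
best_score = sequence_log_score(best["log_prior"], best["phones"])
```

Working in the log domain turns the product of prior and likelihoods into a sum, which avoids numerical underflow when many phones are scored.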
Bibliographic reference. Riley, Michael D. / Ljolje, Andrej (1991): "Lexical access with a statistically-derived phonetic network", In EUROSPEECH-1991, 585-588.