4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
One of the key problems for large-vocabulary ASR is the detection of unknown or misrecognized portions of the input. This paper presents results obtained using a local rejection algorithm. The algorithm is derived from the two-pass recognition algorithm by Murveit and detects misrecognized portions based on the number of active words per frame during the second pass. The underlying hypothesis is that recognition on unexpected data, i.e. noise or out-of-vocabulary (OOV) words, is likely to activate more words, since no word matches the data well; conversely, when the match is good, fewer words should be active. The algorithm was tried on part of the WSJ 5K November 1993 test, in which there were no OOV words (3370 words in total), and on the digit-strings-only Macrophone data (14686 words, of which 895 were OOV). The results indicate that the approach is promising for detecting both OOV words and misrecognized portions of the input. It may provide a basis on which to build tools for dealing with these phenomena. Such tools might include dialogue mechanisms based on the list of activated words corresponding to a rejected portion, display mechanisms such as reverse video, or rescoring schemes.
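The hypothesis in the abstract can be illustrated with a minimal sketch: given the number of active words in each frame of the second pass, frames whose count exceeds a threshold are flagged, and consecutive flagged frames are grouped into candidate rejected portions. The abstract does not specify the decision rule, so the fixed threshold, the minimum region length, and the function names `flag_rejected_frames` and `rejected_regions` are assumptions made for illustration only, not the authors' method.

```python
from typing import List, Tuple


def flag_rejected_frames(active_counts: List[int], threshold: int) -> List[bool]:
    """Flag frames whose active-word count exceeds a threshold.

    Assumption: a simple fixed threshold stands in for whatever
    decision rule the paper actually uses.
    """
    return [count > threshold for count in active_counts]


def rejected_regions(flags: List[bool], min_len: int = 1) -> List[Tuple[int, int]]:
    """Group consecutive flagged frames into half-open (start, end) regions.

    Regions shorter than min_len frames are dropped (hypothetical
    parameter, used here to suppress isolated spikes).
    """
    regions: List[Tuple[int, int]] = []
    start = None
    for i, flagged in enumerate(flags):
        if flagged and start is None:
            start = i  # a new flagged run begins
        elif not flagged and start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None  # the run has ended
    # Close a run that extends to the last frame.
    if start is not None and len(flags) - start >= min_len:
        regions.append((start, len(flags)))
    return regions


# Illustrative (made-up) per-frame active-word counts.
counts = [3, 4, 12, 15, 14, 5, 2, 20]
flags = flag_rejected_frames(counts, threshold=10)
regions = rejected_regions(flags, min_len=2)
```

In this toy input, frames 2 to 4 form one candidate rejected portion, while the isolated spike in the last frame is discarded by the minimum-length filter. A real system would tune both parameters on held-out data.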
Bibliographic reference. Lacouture, Roxane / Normandin, Yves (1996): "Detection of ambiguous portions of signal corresponding to OOV words or misrecognized portions of input", In ICSLP-1996, 2071-2074.