Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
This paper describes a speech interface to an information retrieval system. It consists of three main components: a speech recognition system, a command generator, and a response generator. The speech recognition system accepts a spoken command in the form of a Japanese sentence and passes the recognized sentence to the command generator, which translates it into a formal query command that operates the information retrieval system. The response generator receives the retrieved data and produces a response to the user as a written sentence. We proposed a basic strategy for constructing the speech recognition system: top-down linguistic hypotheses are made at the lexical level, but they are verified using units that are independent of the word, namely phonetic strings bounded by robust phones (phones that can be detected reliably), in order to reduce the misrecognition of short function words. In the natural language interface, syntactic and semantic analyses are performed simultaneously, which makes it possible to resolve syntactic ambiguities. The interface was tested on a speech corpus of 53 sentences, each spoken by three male speakers. The best sentence understanding rate obtained was 89.9% for a small task.
Bibliographic reference. Niimi, Yasuhisa / Kobayashi, Yutaka (1992): "An information retrieval system with a speech interface", In ICSLP-1992, 1407-1410.