First International Conference on Spoken Language Processing (ICSLP 90)
This paper gives an overview of our speech understanding system and reports recent results from sentence recognition experiments. The system recognizes database queries in natural Japanese, spoken sentence by sentence. Based on a hierarchical architecture, the system predicts word strings in a top-down manner; verification, however, proceeds independently of word boundaries. Phoneme strings bounded by easily detectable phonemes are dynamically extracted from the predicted word string and used as verification templates. We carried out sentence recognition experiments using two different matching modules: the lattice matcher and the HMM matcher. The controller adopted a left-to-right, time-synchronous beam-search strategy for searching likely sentences. The speech corpus consists of 159 sentences read by three Japanese male speakers. The task perplexity was 8.3. Using the speaker-dependent HMM parameters, the sentence recognition rates were 62.3-71.1% for the lattice matcher and 83.0-92.5% for the HMM matcher.
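For readers unfamiliar with the search strategy named in the abstract, the following is a minimal sketch of a generic left-to-right, time-synchronous beam search. It is not the system's controller: the abstract does not give its details, and the real system verifies phoneme-string templates with a lattice or HMM matcher. The function `beam_search`, the per-frame score tables, the `toy_successors` grammar, and all numeric values are hypothetical, chosen only to illustrate the idea that every surviving hypothesis is extended at each time step and only the best few are kept.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# A hypothesis is (accumulated log score, symbol sequence so far).
Hypothesis = Tuple[float, Tuple[str, ...]]


def beam_search(
    frames: Sequence[Dict[str, float]],
    successors: Callable[[Tuple[str, ...]], Sequence[str]],
    beam_width: int,
) -> List[Hypothesis]:
    """Generic time-synchronous beam search (illustrative only).

    frames[t] maps each symbol to a hypothetical log-likelihood at time t.
    successors(seq) lists the symbols the (assumed) grammar allows next.
    """
    beam: List[Hypothesis] = [(0.0, ())]
    for frame in frames:  # advance all hypotheses one time step together
        candidates: List[Hypothesis] = []
        for score, seq in beam:
            for sym in successors(seq):
                logp = frame.get(sym, -math.inf)
                if logp > -math.inf:  # skip symbols with no support
                    candidates.append((score + logp, seq + (sym,)))
        # Prune: keep only the beam_width best-scoring hypotheses.
        candidates.sort(key=lambda h: h[0], reverse=True)
        beam = candidates[:beam_width]
    return beam


# Hypothetical toy grammar: which symbol may follow which.
TOY_GRAMMAR = {"<s>": ["a", "b"], "a": ["a", "b"], "b": ["c"], "c": ["c"]}


def toy_successors(seq: Tuple[str, ...]) -> Sequence[str]:
    return TOY_GRAMMAR[seq[-1] if seq else "<s>"]


# Hypothetical per-frame scores (log probabilities).
toy_frames = [
    {"a": math.log(0.6), "b": math.log(0.4)},
    {"a": math.log(0.3), "b": math.log(0.5), "c": math.log(0.2)},
    {"b": math.log(0.1), "c": math.log(0.9)},
]

result = beam_search(toy_frames, toy_successors, beam_width=2)
best_score, best_sequence = result[0]
```

With a beam width of 2, the hypothesis starting with `b` is pruned after the second frame because two other hypotheses score higher. A narrower beam searches faster but risks discarding the correct sentence early.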
Bibliographic reference. Kobayashi, Yutaka / Niimi, Yasuhisa (1990): "Evaluation of a speech understanding system - suskit-2", In ICSLP-1990, 725-728.