First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

Speaker-Independent Word Spotting and a Transputer-Based Implementation

Akihiro Imamura, Yoshitake Suzuki

NTT Human Interface Laboratories, Kanagawa, Japan

This paper describes an HMM-based, speaker-independent word spotting system and its Transputer-based implementation. Candidate word end-points and their likelihood scores are computed with the continuous Viterbi decoding algorithm. To prune unreasonable candidates, a new duration control method, a threshold logic for the likelihood scores, and a new local peak detection method are proposed. The word spotting system is parallelized efficiently on a tree structure of Transputers. In each frame period, the spectral feature vector from the speech analyzer is broadcast from the root Transputer (Processing Master, PM) to the node Transputers (Processing Elements, PEs). Each PE performs continuous Viterbi decoding and candidate pruning in parallel, and the spotting results are returned to the PM. With 8 PEs in a tree structure, 72 words can be processed within a 12 ms frame period. Word detection experiments using the 10 Japanese digits spoken over a noisy telephone network yield a word detection accuracy of 97%.
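The three pruning stages named in the abstract (duration control, score thresholding, local peak detection) can be illustrated with a small sketch. The abstract does not specify how the proposed methods work, so everything below is a generic stand-in and not the authors' algorithm: the `Candidate` record, the `prune_candidates` function, its parameters (`min_frames`, `max_frames`, `threshold`, `peak_radius`), and the example scores are all hypothetical. It assumes each end-point candidate produced by the continuous Viterbi decoder carries a start frame, an end frame, and a score where higher means more likely.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical end-point candidate emitted by the decoder for one word."""
    end_frame: int
    start_frame: int
    score: float  # assumed: higher is better (e.g. a normalized log-likelihood)


def prune_candidates(
    cands: List[Candidate],
    min_frames: int,
    max_frames: int,
    threshold: float,
    peak_radius: int,
) -> List[Candidate]:
    """Generic three-stage pruning; an illustration, not the paper's method."""
    # Stages 1 and 2: keep candidates whose duration lies in the allowed
    # window and whose score reaches the threshold.
    kept = [
        c for c in cands
        if min_frames <= c.end_frame - c.start_frame + 1 <= max_frames
        and c.score >= threshold
    ]

    # Stage 3: keep a candidate only if no other surviving candidate whose
    # end frame lies within peak_radius frames has a better score.
    # Ties are broken in favor of the earlier end frame.
    peaks = []
    for c in kept:
        is_peak = True
        for o in kept:
            if o is c:
                continue
            if abs(o.end_frame - c.end_frame) <= peak_radius:
                better = o.score > c.score
                tie_earlier = o.score == c.score and o.end_frame < c.end_frame
                if better or tie_earlier:
                    is_peak = False
                    break
        if is_peak:
            peaks.append(c)
    return peaks


# Made-up candidates for a single word model.
example = [
    Candidate(end_frame=20, start_frame=5, score=-2.5),
    Candidate(end_frame=21, start_frame=5, score=-1.8),   # local peak
    Candidate(end_frame=22, start_frame=6, score=-2.1),
    Candidate(end_frame=23, start_frame=22, score=-1.0),  # too short
    Candidate(end_frame=40, start_frame=25, score=-3.5),  # below threshold
    Candidate(end_frame=60, start_frame=45, score=-2.0),  # isolated peak
]
detections = prune_candidates(
    example, min_frames=8, max_frames=50, threshold=-3.0, peak_radius=3
)
print([c.end_frame for c in detections])  # prints [21, 60]
```

In the parallel scheme the abstract describes, a filter of this kind would run on each PE for the words assigned to it. If the 72 words were divided evenly among the 8 PEs (an assumption; the abstract does not state the partitioning), each PE would handle 9 words per 12 ms frame period.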

