Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Unconstrained Speech Retrieval for Chinese Document Databases With Very Large Vocabulary and Unlimited Domains

Sung-Chien Lin (1), Lee-Feng Chien (2), Keh-Jiann Chen (2), Lin-Shan Lee (1,2)

(1) Dept. of Computer Science and Information Engineering, National Taiwan University
(2) Institute of Information Science, Academia Sinica, Taipei, Taiwan, Republic of China

This paper presents a new approach to retrieving documents from Chinese document databases with unconstrained speech-input queries. The approach integrates speech recognition and information retrieval technologies, with special consideration of the characteristics of the Chinese language. Such an approach is especially important for Chinese because the language is not alphabetic, and entering Chinese characters into computers remains a difficult, unsolved problem. The approach is syllable-based, and it has two main advantages: knowledge acquired from the database provides grammatical constraints for speech recognition, and approximate text matching tolerates speech recognition errors. Based on this approach, a prototype system has been implemented, and the experimental results are encouraging.
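To make the idea of tolerating recognition errors through approximate matching concrete, the following is a minimal sketch, not the authors' method. It assumes that queries and documents are both represented as sequences of toneless pinyin syllables, and it scores a document by the smallest edit distance between the recognized query and any contiguous span of the document. The function names, the unit edit costs, and the sample data are all hypothetical and chosen only for illustration.

```python
def min_substring_edit_distance(query, document):
    """Smallest edit distance between `query` and any contiguous span of `document`.

    Both arguments are lists of syllable strings (an assumed representation).
    """
    # Row for the empty query prefix: it matches anywhere at zero cost,
    # which lets the match start at any position in the document.
    prev = [0] * (len(document) + 1)
    for i, q in enumerate(query, start=1):
        # Column 0: the first i query syllables aligned with an empty span.
        curr = [i] + [0] * len(document)
        for j, d in enumerate(document, start=1):
            curr[j] = min(
                prev[j] + 1,             # query syllable absent from the document
                curr[j - 1] + 1,         # extra syllable in the document span
                prev[j - 1] + (q != d),  # exact match (0) or substitution (1)
            )
        prev = curr
    # The match may end at any position in the document.
    return min(prev)


def rank_documents(query, documents):
    """Return document names ordered from best to worst approximate match."""
    return sorted(
        documents,
        key=lambda name: min_substring_edit_distance(query, documents[name]),
    )


# Hypothetical data: the recognizer produced "si" where "shi" was spoken.
recognized_query = ["yu", "yin", "bian", "si"]
documents = {
    "doc_a": ["zhong", "wen", "yu", "yin", "bian", "shi", "xi", "tong"],
    "doc_b": ["zi", "liao", "ku", "jian", "suo"],
}

print(min_substring_edit_distance(recognized_query, documents["doc_a"]))  # 1
print(min_substring_edit_distance(recognized_query, documents["doc_b"]))  # 4
print(rank_documents(recognized_query, documents))  # ['doc_a', 'doc_b']
```

In this sketch a single misrecognized syllable costs one edit, so the intended document still ranks first. A real system would likely weight substitutions by how acoustically confusable the syllables are, which is beyond what the abstract specifies.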

Bibliographic reference.  Lin, Sung-Chien / Chien, Lee-Feng / Chen, Keh-Jiann / Lee, Lin-Shan (1995): "Unconstrained speech retrieval for Chinese document databases with very large vocabulary and unlimited domains", In EUROSPEECH-1995, 1203-1206.