5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Topic Recognition for News Speech Based on Keyword Spotting

Yoichi Yamashita (1), Toshikatsu Tsunekawa (2), Riichiro Mizoguchi (2)

(1) Dep. of Computer Science, Ritsumeikan University, Japan
(2) I.S.I.R., Osaka University, Japan

This paper describes topic identification for Japanese TV news speech based on keyword spotting. Three thousand nouns are selected as keywords that contribute to topic identification, using mutual information and word length as selection criteria. This keyword set identified the correct topic for 76.3% of articles in newspaper text data. We then performed keyword spotting on TV news speech and identified the topic of each spoken message by computing topic likelihoods from the acoustic score of each spotted word and the topic probability of that word. To neutralize the effect of false alarms, the topic bias in the keyword set is removed. The topic identification rate is 66.5%, where an identification is counted as correct if the correct topic is among the top three candidates. Removing the bias improved the identification rate by 6.1%.
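The abstract does not give the scoring formula, so the following is only a minimal sketch of how an acoustic score and a per-word topic probability might be combined, and of one possible way to remove topic bias. The function names, the additive combination, the toy probabilities, and the choice to model bias as the average topic probability over the keyword set are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def topic_bias(topic_prob):
    """Average P(topic | word) over the keyword set.

    Assumption: a false alarm behaves like a keyword drawn uniformly at
    random, so this average is the score each topic gains from false alarms.
    """
    bias = defaultdict(float)
    for probs in topic_prob.values():
        for topic, p in probs.items():
            bias[topic] += p / len(topic_prob)
    return dict(bias)


def rank_topics(spotted, topic_prob, remove_bias=True):
    """Rank topics from spotted keywords.

    spotted:    list of (word, acoustic_score) pairs, scores in [0, 1]
    topic_prob: dict mapping word -> {topic: P(topic | word)}
    Returns topic names sorted from highest to lowest score.
    """
    bias = topic_bias(topic_prob)
    scores = {topic: 0.0 for topic in bias}
    for word, acoustic in spotted:
        probs = topic_prob.get(word, {})
        for topic in scores:
            p = probs.get(topic, 0.0)
            if remove_bias:
                # Subtract what a random keyword would contribute on average.
                p -= bias[topic]
            scores[topic] += acoustic * p
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: three of the four keywords lean toward "politics",
# so the keyword set itself is biased toward that topic.
topic_prob = {
    "election": {"politics": 0.9, "economy": 0.1},
    "minister": {"politics": 0.8, "economy": 0.2},
    "budget": {"politics": 0.5, "economy": 0.5},
    "yen": {"politics": 0.1, "economy": 0.9},
}
spotted = [("yen", 0.6), ("election", 0.4), ("minister", 0.4), ("budget", 0.4)]

print(rank_topics(spotted, topic_prob, remove_bias=False)[:3])
print(rank_topics(spotted, topic_prob, remove_bias=True)[:3])
```

In this toy case the low-scoring detections tilt the uncorrected ranking toward the over-represented topic, and subtracting the bias reverses that. Slicing the ranking with `[:3]` mirrors the top-three criterion used for the reported identification rate.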


Bibliographic reference.  Yamashita, Yoichi / Tsunekawa, Toshikatsu / Mizoguchi, Riichiro (1998): "Topic recognition for news speech based on keyword spotting", In ICSLP-1998, paper 0023.