Topic spotting with whole-word models has been shown to give high detection rates with few false alarms. However, to be flexible in use, the system must be able to generate keyword models without repeated data collection sessions. Since the required vocabulary is not known a priori, the models must be task independent. Task independence degrades system performance, which must then be restored. This can be achieved by using linear discriminant analysis in the feature extractor and by generating context-dependent subword models using decision trees. The system concatenates the context-dependent models to form the keyword models. Keywords are selected according to their usefulness. Non-keyword speech is modelled by a set of monophone models. During topic spotting, each keyword occurrence is weighted according to the discrimination that keyword provides between topic and non-topic material. The system was tested on the BBC database, spotting two-minute weather forecasts. It detected 95% of the weather forecasts at a rate of one false alarm per hour. It was also tested on three other topics, where its performance was not as good but still useful.
Keywords: topic spotting, word-spotting, context dependent models, linear discriminant analysis.
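The abstract does not state the weighting formula, so the following is only a minimal sketch of one common way to weight keyword occurrences by how well they discriminate topic from non-topic material. It assumes a log ratio of smoothed keyword rates; the function names, the smoothing constant, and the example counts are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def keyword_weights(topic_counts, other_counts, topic_hours, other_hours,
                    smoothing=0.5):
    """Weight each keyword by the log ratio of its smoothed rate (per hour)
    in topic audio to its rate in non-topic audio.

    Assumption: this log-ratio form is illustrative, not the paper's formula.
    """
    weights = {}
    for kw in set(topic_counts) | set(other_counts):
        topic_rate = (topic_counts.get(kw, 0) + smoothing) / topic_hours
        other_rate = (other_counts.get(kw, 0) + smoothing) / other_hours
        weights[kw] = math.log(topic_rate / other_rate)
    return weights


def topic_score(detected_keywords, weights):
    """Sum the weights of the keywords detected in one window of speech.

    Keywords without a weight contribute nothing to the score.
    """
    counts = Counter(detected_keywords)
    return sum(weights.get(kw, 0.0) * n for kw, n in counts.items())


# Hypothetical training counts: 1 hour of topic audio, 10 hours of other audio.
example_weights = keyword_weights(
    topic_counts={"rain": 40, "sunny": 30, "today": 20},
    other_counts={"rain": 5, "sunny": 3, "today": 200},
    topic_hours=1.0,
    other_hours=10.0,
)
```

Under these assumed counts, "rain" and "sunny" receive large positive weights because they occur far more often per hour in topic material, while "today" receives a weight near zero because it is about equally frequent in both. A window would be flagged as on-topic when its score exceeds a threshold chosen to trade detection rate against false alarms.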
Bibliographic reference. Carey, Michael J. / Parris, Eluned S. (1995): "Topic spotting with task independent models", In EUROSPEECH-1995, 2133-2136.