5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Text Segmentation and Topic Tracking on Broadcast News Via a Hidden Markov Model Approach

Paul van Mulbregt, Ira Carp, Lawrence Gillick, Steve Lowe, Jon Yamron

Dragon Systems, Inc., USA

Expertise in the automatic transcription of broadcast speech has progressed to the point where the resulting transcripts can be used for information retrieval. In this paper, we first describe a corpus of automatically recognized broadcast news, then present a method for segmenting the broadcast into stories, and finally apply this method to retrieve stories relating to a specific topic. The method is based on Hidden Markov Models and is analogous to the usual implementation of HMMs in speech recognition.
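
The abstract does not spell out the model, so the following is only an illustrative sketch of how HMM-based story segmentation might look, not the authors' implementation. It assumes that hidden states are topics, that each topic emits sentences according to a smoothed unigram language model, and that a fixed probability governs switching topics between sentences; Viterbi decoding then plays the role it has in speech recognition. The names `train_unigram`, `segment`, `alpha`, and `switch_prob` are hypothetical.

```python
import math
from collections import Counter


def train_unigram(docs, vocab, alpha=1.0):
    """Add-alpha smoothed unigram log-probabilities for one topic (assumed model)."""
    counts = Counter(w for doc in docs for w in doc)
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: math.log((counts[w] + alpha) / total) for w in vocab}


def segment(sentences, topic_models, switch_prob=0.1):
    """Viterbi decoding over topic states; returns (labels, boundaries).

    labels[i] is the index of the most likely topic for sentence i, and
    boundaries lists the sentence indices at which the topic changes.
    """
    k = len(topic_models)
    if not sentences:
        return [], []
    if k < 2:
        raise ValueError("need at least two topic models")

    # Assumed transition model: stay in the topic, or switch uniformly.
    stay = math.log(1.0 - switch_prob)
    switch = math.log(switch_prob / (k - 1))

    def emit(model, sent):
        # Log-likelihood of a sentence; out-of-vocabulary words are skipped.
        return sum(model[w] for w in sent if w in model)

    # Uniform initial distribution, so the constant term is omitted.
    score = [emit(m, sentences[0]) for m in topic_models]
    back = []
    for sent in sentences[1:]:
        new_score, pointers = [], []
        for j, model in enumerate(topic_models):
            best_prev = max(
                range(k),
                key=lambda i: score[i] + (stay if i == j else switch),
            )
            trans = stay if best_prev == j else switch
            new_score.append(score[best_prev] + trans + emit(model, sent))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Trace back the best path from the best final state.
    state = max(range(k), key=lambda i: score[i])
    labels = [state]
    for pointers in reversed(back):
        state = pointers[state]
        labels.append(state)
    labels.reverse()

    boundaries = [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]
    return labels, boundaries
```

Under these assumptions, topic tracking reduces to keeping the segments whose decoded label matches the target topic. A real system would operate on recognizer output and would need to handle recognition errors and a much larger topic inventory.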


Bibliographic reference.  Mulbregt, Paul van / Carp, Ira / Gillick, Lawrence / Lowe, Steve / Yamron, Jon (1998): "Text segmentation and topic tracking on broadcast news via a hidden Markov model approach", In ICSLP-1998, paper 0116.