Hierarchical Recurrent Neural Network for Story Segmentation

Emiru Tsunoo, Peter Bell, Steve Renals


A broadcast news stream consists of a number of stories, and each story consists of several sentences. We capture this structure with a hierarchical model comprising a word-level Recurrent Neural Network (RNN) sentence-modeling layer and a sentence-level bidirectional Long Short-Term Memory (LSTM) topic-modeling layer. First, the word-level RNN layer extracts a vector embedding the sentence information from the transcribed lexical tokens of each sentence. These sentence embedding vectors are fed into the bidirectional LSTM, which models sentence and topic transitions. A topic posterior for each sentence is estimated discriminatively, and a Hidden Markov Model (HMM) then decodes the story sequence and identifies story boundaries. Experiments on the Topic Detection and Tracking (TDT2) task indicate that the hierarchical RNN topic model achieves the best story segmentation performance, with a higher F1-measure than conventional state-of-the-art methods. We also compare variations of our model to infer the optimal structure for the story segmentation task.
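The pipeline in the abstract can be sketched end to end: a word-level recurrent layer compresses each sentence into a vector, a sentence-level bidirectional recurrent layer produces a topic posterior per sentence, and a Viterbi pass over an HMM marks story boundaries where the decoded topic changes. The following is a minimal illustrative sketch, not the authors' implementation: plain tanh RNNs stand in for the word-level RNN and the bidirectional LSTM, all weights are random rather than trained, and the dimensions, transition probability, and toy token ids are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
V, E, H, T = 50, 8, 16, 3   # vocab size, word-embedding dim, hidden dim, number of topics (all illustrative)

emb = rng.standard_normal((V, E)) * 0.1                                   # word embedding table
Wx, Wh = rng.standard_normal((E, H)) * 0.1, rng.standard_normal((H, H)) * 0.1    # word-level RNN
Uf_x, Uf_h = rng.standard_normal((H, H)) * 0.1, rng.standard_normal((H, H)) * 0.1  # sentence-level, forward
Ub_x, Ub_h = rng.standard_normal((H, H)) * 0.1, rng.standard_normal((H, H)) * 0.1  # sentence-level, backward
Wo = rng.standard_normal((2 * H, T)) * 0.1                                # posterior output layer

def rnn(xs, Wi, Wr):
    """Run a simple tanh RNN over a list of vectors; return all hidden states."""
    h, out = np.zeros(Wr.shape[0]), []
    for x in xs:
        h = np.tanh(x @ Wi + h @ Wr)
        out.append(h)
    return out

def sentence_vector(word_ids):
    """Word-level layer: embed tokens, keep the final hidden state as the sentence embedding."""
    return rnn([emb[w] for w in word_ids], Wx, Wh)[-1]

def topic_posteriors(sentences):
    """Sentence-level bidirectional layer followed by a softmax topic posterior per sentence."""
    s = [sentence_vector(ids) for ids in sentences]
    fwd = rnn(s, Uf_x, Uf_h)
    bwd = rnn(s[::-1], Ub_x, Ub_h)[::-1]
    logits = np.stack([np.concatenate([f, b]) @ Wo for f, b in zip(fwd, bwd)])
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    return p / p.sum(axis=1, keepdims=True)

def hmm_decode(post, stay=0.9):
    """Viterbi over topic states; a change of decoded topic marks a story boundary."""
    trans = np.full((T, T), (1 - stay) / (T - 1))
    np.fill_diagonal(trans, stay)
    logp, logt = np.log(post), np.log(trans)
    delta, back = logp[0] - np.log(T), []
    for t in range(1, len(post)):
        scores = delta[:, None] + logt          # scores[i, j]: best path ending in i, then moving to j
        back.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) + logp[t]
    path = [int(delta.argmax())]
    for bp in reversed(back):                   # trace back the best state sequence
        path.append(int(bp[path[-1]]))
    path.reverse()
    return path, [i for i in range(1, len(path)) if path[i] != path[i - 1]]

sentences = [[1, 2, 3], [2, 3], [7, 8, 9, 10], [9, 7]]   # toy transcribed token ids, one list per sentence
post = topic_posteriors(sentences)
path, boundaries = hmm_decode(post)
print(post.shape, len(path))   # (4, 3) 4
```

With trained weights, `boundaries` would hold the sentence indices where a new story begins; here the randomly initialized network only demonstrates the data flow between the three stages.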


DOI: 10.21437/Interspeech.2017-392

Cite as: Tsunoo, E., Bell, P., Renals, S. (2017) Hierarchical Recurrent Neural Network for Story Segmentation. Proc. Interspeech 2017, 2919-2923, DOI: 10.21437/Interspeech.2017-392.


@inproceedings{Tsunoo2017,
  author={Emiru Tsunoo and Peter Bell and Steve Renals},
  title={Hierarchical Recurrent Neural Network for Story Segmentation},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2919--2923},
  doi={10.21437/Interspeech.2017-392},
  url={http://dx.doi.org/10.21437/Interspeech.2017-392}
}