ISCA Archive Interspeech 2013

Probabilistic trainable segmenter for call center audio using multiple features

Nina Zinovieva, Xiaodan Zhuang, Pat Peterson, Joe Alwan, Rohit Prasad

An important component of customer call experience analysis is distinguishing the different segments of a call, including interactive voice response (IVR), waiting in queue, and interaction with an agent. Because segment information from telephone switches is not always available, or may be difficult to obtain, we sought a method that could perform such segmentation solely from the recorded audio. In this paper, we present a probabilistic framework for segmenting call center audio into IVR, Queue, and Agent using a suite of rich features based on both speech and non-speech content. We study statistical classifiers including Maximum Entropy (MaxEnt) and Conditional Random Field (CRF). We present experimental results on real-world call center data and demonstrate that the probabilistic approach outperforms a rule-based approach in segmentation performance while significantly reducing the time needed to deploy the segmenter for a new call center.

doi: 10.21437/Interspeech.2013-486

Cite as: Zinovieva, N., Zhuang, X., Peterson, P., Alwan, J., Prasad, R. (2013) Probabilistic trainable segmenter for call center audio using multiple features. Proc. Interspeech 2013, 2054-2058, doi: 10.21437/Interspeech.2013-486

  author={Nina Zinovieva and Xiaodan Zhuang and Pat Peterson and Joe Alwan and Rohit Prasad},
  title={{Probabilistic trainable segmenter for call center audio using multiple features}},
  booktitle={Proc. Interspeech 2013},