DNN Bottleneck Features for Speaker Clustering

Jesús Jorrín, Paola García, Luis Buera


In this work, we explore deep neural network bottleneck features (BNF) in the context of speaker clustering. A straightforward way to approach speaker clustering is to reuse the bottleneck features extracted for speaker recognition. However, the choice of bottleneck architecture or nonlinearity affects the performance of both systems. In this work, we analyze the bottleneck features obtained for speaker recognition and test them in a speaker clustering scenario. We observe that some deep neural network topologies work better for both tasks, even when their training criterion (senone classification) is only loosely met. We present results that outperform a traditional MFCC system by 21% for speaker recognition and by between 20% and 37% for clustering using the same topology.
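To illustrate the idea, bottleneck features are simply the activations of a narrow hidden layer in a DNN trained for some auxiliary task (here, senone classification), read out per frame instead of the network's output. The following sketch is purely illustrative and not the authors' system: the layer sizes, ReLU nonlinearities, and the use of random untrained weights are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative layer sizes (assumptions, not from the paper):
# 60-dim input (e.g. stacked MFCCs), two 512-unit hidden layers,
# a 40-dim linear bottleneck, and a 2000-class senone output.
sizes = [60, 512, 512, 40, 2000]
weights = [rng.standard_normal((m, n)) * 0.01
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def extract_bnf(frames, bottleneck_index=3):
    """Forward-propagate frames and return bottleneck activations.

    frames: (num_frames, input_dim) acoustic feature matrix.
    The senone softmax layer is not needed at extraction time;
    we stop at the bottleneck layer and return its activations.
    """
    h = frames
    for i, (W, b) in enumerate(zip(weights, biases), start=1):
        h = h @ W + b
        if i == bottleneck_index:
            return h          # linear bottleneck: no nonlinearity applied
        h = np.maximum(h, 0)  # ReLU on the ordinary hidden layers
    return h

frames = rng.standard_normal((100, 60))  # 100 frames of dummy features
bnf = extract_bnf(frames)
print(bnf.shape)  # (100, 40): one 40-dim BNF vector per frame
```

These per-frame BNF vectors would then replace (or augment) MFCCs as the input features to the downstream speaker recognition or clustering back-end.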


DOI: 10.21437/Interspeech.2017-144

Cite as: Jorrín, J., García, P., Buera, L. (2017) DNN Bottleneck Features for Speaker Clustering. Proc. Interspeech 2017, 1024-1028, DOI: 10.21437/Interspeech.2017-144.


@inproceedings{Jorrín2017,
  author={Jesús Jorrín and Paola García and Luis Buera},
  title={DNN Bottleneck Features for Speaker Clustering},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={1024--1028},
  doi={10.21437/Interspeech.2017-144},
  url={http://dx.doi.org/10.21437/Interspeech.2017-144}
}