The TalTech Systems for the Short-Duration Speaker Verification Challenge 2020

Tanel Alumäe, Jörgen Valk


This paper presents the Tallinn University of Technology systems submitted to the Short-duration Speaker Verification Challenge 2020. The challenge consists of two tasks, focusing on text-dependent and text-independent speaker verification with some cross-lingual aspects. We used speaker embedding models that consist of squeeze-and-attention based residual layers, multi-head attention, and either a cross-entropy or an additive angular margin objective function. To encourage the models to produce language-independent embeddings, we trained them in a multi-task manner, using dataset-specific output layers. In the text-dependent task, we employed a phrase classifier to reject trials with non-matching phrases. In the text-independent task, we used a language classifier to boost the scores of trials where the languages of the test and enrollment utterances do not match. Our final primary metric score was 0.075 in Task 1 (ranked 6th) and 0.118 in Task 2 (ranked 8th).
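The two trial-level adjustments mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the rejection score, and the size of the language boost are hypothetical, since the abstract does not state them. The sketch assumes each utterance already carries a predicted phrase or language label from a separate classifier.

```python
# Hypothetical constants; the abstract does not give the actual values.
REJECT_SCORE = -1000.0  # assumed score assigned to trials with non-matching phrases
LANGUAGE_BOOST = 0.1    # assumed offset added when the predicted languages differ


def adjust_text_dependent(score: float, enroll_phrase: str, test_phrase: str) -> float:
    """Task 1 sketch: reject the trial when the predicted phrase labels differ."""
    if enroll_phrase != test_phrase:
        return REJECT_SCORE
    return score


def adjust_text_independent(score: float, enroll_lang: str, test_lang: str) -> float:
    """Task 2 sketch: boost the score when the predicted languages differ."""
    if enroll_lang != test_lang:
        return score + LANGUAGE_BOOST
    return score
```

In this sketch, a same-speaker trial recorded in two languages, which would otherwise tend to score lower, receives a fixed positive offset, while a text-dependent trial with the wrong phrase is pushed below any realistic acceptance threshold.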


DOI: 10.21437/Interspeech.2020-2233

Cite as: Alumäe, T., Valk, J. (2020) The TalTech Systems for the Short-Duration Speaker Verification Challenge 2020. Proc. Interspeech 2020, 746-750, DOI: 10.21437/Interspeech.2020-2233.


@inproceedings{Alumäe2020,
  author={Tanel Alumäe and Jörgen Valk},
  title={{The TalTech Systems for the Short-Duration Speaker Verification Challenge 2020}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={746--750},
  doi={10.21437/Interspeech.2020-2233},
  url={http://dx.doi.org/10.21437/Interspeech.2020-2233}
}