STC-Innovation Speaker Recognition Systems for Far-Field Speaker Verification Challenge 2020

Aleksei Gusev, Vladimir Volokhov, Alisa Vinogradova, Tseren Andzhukaev, Andrey Shulipa, Sergey Novoselov, Timur Pekhovsky, Alexander Kozlov


This paper presents the speaker recognition (SR) systems submitted by the Speech Technology Center (STC) team to the Far-Field Speaker Verification Challenge 2020. The challenge comprises three SR tasks: far-field text-dependent speaker verification from a single microphone array (Track 1), far-field text-independent speaker verification from a single microphone array (Track 2), and far-field text-dependent speaker verification from distributed microphone arrays (Track 3).

In this paper, we present the techniques and ideas underlying our best-performing models. Experiments on x-vector-based and ResNet-like architectures show that ResNet-based networks outperform x-vector-based systems. The submitted systems are fusions of ResNet34-based extractors trained on 80 log Mel-filter bank energies (MFBs) post-processed with a U-Net-like voice activity detector (VAD). The best systems for Track 1, Track 2 and Track 3 achieved 5.08% EER and 0.500 Cmindet, 5.39% EER and 0.541 Cmindet, and 5.53% EER and 0.458 Cmindet, respectively, on the challenge evaluation sets.
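For readers unfamiliar with the equal error rate (EER) reported above, the following is a minimal sketch of how it can be estimated from trial scores. It is not the challenge's official scoring tool, which may interpolate between operating points. The function name `equal_error_rate` and the toy scores are hypothetical and chosen only for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER (as a fraction) from two lists of verification scores.

    Assumption: a trial is accepted when its score is >= the threshold.
    """
    # Candidate thresholds: every distinct observed score, in ascending order.
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: target trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: non-target trials scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        # Keep the threshold at which the two error rates are closest.
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (illustrative only, not challenge data).
targets = [0.9, 0.8, 0.7, 0.3]
nontargets = [0.6, 0.4, 0.2, 0.1]
print(f"EER = {100 * equal_error_rate(targets, nontargets):.2f}%")
```

With these toy scores, one target and one non-target trial fall on the wrong side of the best threshold, so both error rates equal one in four.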


DOI: 10.21437/Interspeech.2020-2580

Cite as: Gusev, A., Volokhov, V., Vinogradova, A., Andzhukaev, T., Shulipa, A., Novoselov, S., Pekhovsky, T., Kozlov, A. (2020) STC-Innovation Speaker Recognition Systems for Far-Field Speaker Verification Challenge 2020. Proc. Interspeech 2020, 3466-3470, DOI: 10.21437/Interspeech.2020-2580.


@inproceedings{Gusev2020,
  author={Aleksei Gusev and Vladimir Volokhov and Alisa Vinogradova and Tseren Andzhukaev and Andrey Shulipa and Sergey Novoselov and Timur Pekhovsky and Alexander Kozlov},
  title={{STC-Innovation Speaker Recognition Systems for Far-Field Speaker Verification Challenge 2020}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={3466--3470},
  doi={10.21437/Interspeech.2020-2580},
  url={http://dx.doi.org/10.21437/Interspeech.2020-2580}
}