THUEE System for NIST SRE19 CTS Challenge

Ruyun Li, Tianyu Liang, Dandan Song, Yi Liu, Yangcheng Wu, Can Xu, Peng Ouyang, Xianwei Zhang, Xianhong Chen, Wei-Qiang Zhang, Shouyi Yin, Liang He

In this paper, we present the system that THUEE submitted to the NIST 2019 Speaker Recognition Evaluation (SRE19) CTS Challenge. As in previous SREs, domain mismatches between the training and evaluation sets, such as cross-lingual and cross-channel mismatches, remain the major challenges in this evaluation. To improve the robustness of our systems, we develop deeper and wider x-vector architectures. We also use novel speaker-discriminative embedding systems: hybrid multi-task learning architectures that incorporate phonetic information. To deal with domain mismatches, we follow a heuristic search scheme to select the best back-end strategy based on a limited development corpus. An extended and factorized TDNN achieves the best single-system results on the SRE18 DEV and SRE19 EVAL sets. The final system is a fusion of six subsystems, which yields an EER of 2.81% and a minimum cost of 0.262 on the SRE19 EVAL set.
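
For readers unfamiliar with the EER metric quoted above, the following is a minimal sketch of how an equal error rate can be computed from trial scores. It is not the authors' or NIST's scoring code: the function name `equal_error_rate` and the toy scores are hypothetical, and the threshold sweep over observed scores is one simple approximation among several.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER (as a fraction) from target and non-target trial scores.

    Hypothetical helper for illustration only; official evaluation tools may
    interpolate the operating point differently.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)

    # Use every observed score as a candidate decision threshold.
    thresholds = np.sort(np.concatenate([target, nontarget]))

    # Miss rate: fraction of target trials scored below the threshold.
    miss = np.array([(target < t).mean() for t in thresholds])
    # False-alarm rate: fraction of non-target trials at or above the threshold.
    false_alarm = np.array([(nontarget >= t).mean() for t in thresholds])

    # The EER is taken where the two error rates are closest to each other.
    idx = np.argmin(np.abs(miss - false_alarm))
    return (miss[idx] + false_alarm[idx]) / 2.0


if __name__ == "__main__":
    # Toy scores, not data from the paper.
    eer = equal_error_rate([2.0, 3.0, 4.0, 5.0], [0.0, 1.0, 2.5, 1.5])
    print(f"EER = {eer:.2%}")
```

A lower EER means the miss and false-alarm rates balance at a smaller error level; the minimum cost reported alongside it weights the two error types according to the evaluation plan rather than equally.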

 DOI: 10.21437/Interspeech.2020-1245

Cite as: Li, R., Liang, T., Song, D., Liu, Y., Wu, Y., Xu, C., Ouyang, P., Zhang, X., Chen, X., Zhang, W., Yin, S., He, L. (2020) THUEE System for NIST SRE19 CTS Challenge. Proc. Interspeech 2020, 2232-2236, DOI: 10.21437/Interspeech.2020-1245.

  author={Ruyun Li and Tianyu Liang and Dandan Song and Yi Liu and Yangcheng Wu and Can Xu and Peng Ouyang and Xianwei Zhang and Xianhong Chen and Wei-Qiang Zhang and Shouyi Yin and Liang He},
  title={{THUEE System for NIST SRE19 CTS Challenge}},
  booktitle={Proc. Interspeech 2020},