NEC-TT Speaker Verification System for SRE’19 CTS Challenge

Kong Aik Lee, Koji Okabe, Hitoshi Yamamoto, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Keisuke Ishikawa, Koichi Shinoda


The series of speaker recognition evaluations (SREs) organized by the National Institute of Standards and Technology (NIST) is widely accepted as the de facto benchmark for speaker recognition technology. This paper describes the NEC-TT speaker verification system developed for the recent SRE’19 CTS Challenge. Our system is based on an x-vector embedding front-end followed by a thin scoring back-end. We trained a very deep neural network for x-vector extraction by incorporating residual connections, squeeze-and-excitation networks, and angular-margin softmax at the output layer. We enhanced the back-end with a tandem approach that leverages the benefits of both supervised and unsupervised domain adaptation. The front-end enhancements and the back-end enhancements each yielded over 30% relative reduction in error rate.
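To illustrate the angular-margin softmax mentioned above, the following is a minimal NumPy sketch of one common variant, the additive angular margin loss. The abstract does not say which variant the system uses, so the function name `aam_softmax_loss`, the additive form of the margin, and the `margin` and `scale` defaults are all assumptions chosen for illustration, not the authors' configuration.

```python
import numpy as np


def aam_softmax_loss(embeddings, weights, labels, margin=0.2, scale=30.0):
    """Additive angular margin softmax loss (illustrative sketch).

    embeddings: (batch, dim) speaker embeddings
    weights:    (classes, dim) class weight vectors of the output layer
    labels:     (batch,) integer speaker labels
    margin, scale: hypothetical values, not taken from the paper
    """
    # L2-normalise both sides so the dot product is the cosine of the angle.
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = np.clip(e @ w.T, -1.0, 1.0)
    theta = np.arccos(cos)

    # Add the margin to the target-class angle only, capped at pi so the
    # cosine stays monotonic in the angle.
    rows = np.arange(len(labels))
    theta_m = theta.copy()
    theta_m[rows, labels] = np.minimum(theta[rows, labels] + margin, np.pi)

    # Scaled cosine logits followed by a numerically stable log-softmax.
    logits = scale * np.cos(theta_m)
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[rows, labels].mean()
```

Because the margin lowers the target-class logit, the loss for a given batch is larger with a positive margin than with `margin=0.0`, which pushes embeddings of the same speaker closer together in angle during training.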


DOI: 10.21437/Interspeech.2020-1132

Cite as: Lee, K.A., Okabe, K., Yamamoto, H., Wang, Q., Guo, L., Koshinaka, T., Zhang, J., Ishikawa, K., Shinoda, K. (2020) NEC-TT Speaker Verification System for SRE’19 CTS Challenge. Proc. Interspeech 2020, 2227-2231, DOI: 10.21437/Interspeech.2020-1132.


@inproceedings{Lee2020,
  author={Kong Aik Lee and Koji Okabe and Hitoshi Yamamoto and Qiongqiong Wang and Ling Guo and Takafumi Koshinaka and Jiacen Zhang and Keisuke Ishikawa and Koichi Shinoda},
  title={{NEC-TT Speaker Verification System for SRE’19 CTS Challenge}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={2227--2231},
  doi={10.21437/Interspeech.2020-1132},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1132}
}