X-Vector Singular Value Modification and Statistical-Based Decomposition with Ensemble Regression Modeling for Speaker Anonymization System

Candy Olivia Mawalim, Kasorn Galajit, Jessada Karnjana, Masashi Unoki


Anonymizing speaker individuality is crucial for ensuring voice privacy protection. In this paper, we propose a speaker individuality anonymization system that applies singular value modification and statistical-based decomposition to an x-vector, combined with ensemble regression modeling. An anonymization system requires speaker-to-speaker correspondence (each speaker corresponds to a pseudo-speaker), which may be achieved by modifying the significant elements of the x-vector. The significant elements were determined by singular value decomposition and variant analysis. Subsequently, the anonymization process was performed by an ensemble regression model trained using x-vector pools with clustering-based pseudo-targets. The results demonstrated that our proposed anonymization system effectively improves objective verifiability, especially in the setting with anonymized trials and anonymized enrollments, while preserving intelligibility scores similar to those of the baseline system introduced in the VoicePrivacy 2020 Challenge.
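
As a rough illustration of what singular value modification can look like, the sketch below scales the largest singular values of a matrix of x-vectors and rebuilds the matrix. It is not the authors' implementation: the abstract does not specify how the decomposition is applied, so applying SVD to a pool of x-vectors, the function name `modify_singular_values`, the parameters `k` and `scale`, the 512-dimensional size, and the random data are all assumptions made for illustration. The statistical-based decomposition and the ensemble regression stage are not shown.

```python
import numpy as np


def modify_singular_values(pool, k=5, scale=0.5):
    """Scale the k largest singular values of an x-vector pool and rebuild it.

    pool  : array of shape (n_speakers, dim), one x-vector per row (assumed layout)
    k     : number of leading singular values to modify (hypothetical parameter)
    scale : multiplicative factor applied to those values (hypothetical parameter)
    """
    # Thin SVD: pool = u @ diag(s) @ vt, with s sorted in descending order.
    u, s, vt = np.linalg.svd(pool, full_matrices=False)
    s_mod = s.copy()
    s_mod[:k] *= scale
    # Broadcasting u * s_mod multiplies column j of u by s_mod[j],
    # which is equivalent to u @ diag(s_mod).
    return (u * s_mod) @ vt


# Random stand-in data; real x-vectors would come from a speaker embedding extractor.
rng = np.random.default_rng(0)
pool = rng.standard_normal((20, 512))
modified = modify_singular_values(pool, k=5, scale=0.5)
```

With `scale=1.0` the function returns the original pool up to floating-point error, which is a convenient sanity check before trying other values.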


DOI: 10.21437/Interspeech.2020-1887

Cite as: Mawalim, C.O., Galajit, K., Karnjana, J., Unoki, M. (2020) X-Vector Singular Value Modification and Statistical-Based Decomposition with Ensemble Regression Modeling for Speaker Anonymization System. Proc. Interspeech 2020, 1703-1707, DOI: 10.21437/Interspeech.2020-1887.


@inproceedings{Mawalim2020,
  author={Candy Olivia Mawalim and Kasorn Galajit and Jessada Karnjana and Masashi Unoki},
  title={{X-Vector Singular Value Modification and Statistical-Based Decomposition with Ensemble Regression Modeling for Speaker Anonymization System}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={1703--1707},
  doi={10.21437/Interspeech.2020-1887},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1887}
}