Multidimensional scaling of systems in the Voice Conversion Challenge 2016

Mirjam Wester, Zhizheng Wu, Junichi Yamagishi


This study uses a talker discrimination task to investigate how listeners judge the similarity of voice-converted voices. The data come from the Voice Conversion Challenge 2016, in which 17 participants from around the world built voice-converted voices from a shared data set of source and target speakers. This paper describes in more detail the evaluation of similarity for four of the source-target pairs (two intra-gender and two cross-gender). Multidimensional scaling was performed to illustrate where listeners perceived each system to be in an acoustic space, relative to the source and target speakers and to each other.
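
As a rough illustration of the kind of analysis described above, the following is a minimal sketch of classical (Torgerson) multidimensional scaling applied to a small dissimilarity matrix. It is not the authors' implementation: the abstract does not say which MDS variant or software was used, and the labels, the dissimilarity values, and the function name `classical_mds` are all hypothetical, chosen only to show how pairwise "different talker" judgements might be turned into a two-dimensional map.

```python
import numpy as np


def classical_mds(dissim, n_dims=2):
    """Embed items in n_dims dimensions from a symmetric dissimilarity matrix.

    Classical (Torgerson) MDS: double-centre the squared dissimilarities,
    then use the leading eigenvectors scaled by the square roots of their
    eigenvalues as coordinates.
    """
    d = np.asarray(dissim, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Non-Euclidean dissimilarities can give negative eigenvalues; clip them.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Hypothetical data: proportion of trials on which listeners judged two
# voices to be different talkers (made-up values, not from the paper).
labels = ["source", "target", "system_A", "system_B"]
dissim = np.array([
    [0.0, 0.9, 0.7, 0.3],
    [0.9, 0.0, 0.2, 0.8],
    [0.7, 0.2, 0.0, 0.6],
    [0.3, 0.8, 0.6, 0.0],
])

coords = classical_mds(dissim)
for name, (x, y) in zip(labels, coords):
    print(f"{name:10s} {x: .3f} {y: .3f}")
```

In a map like this, a system that lies close to the target and far from the source would be one that listeners tended to judge as sounding like the target speaker. The axes themselves carry no fixed meaning; only the relative distances between points are interpretable.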


DOI: 10.21437/SSW.2016-7

Cite as

Wester, M., Wu, Z., Yamagishi, J. (2016) Multidimensional scaling of systems in the Voice Conversion Challenge 2016. Proc. 9th ISCA Speech Synthesis Workshop, 38-43.

Bibtex
@inproceedings{Wester+2016,
author={Mirjam Wester and Zhizheng Wu and Junichi Yamagishi},
title={Multidimensional scaling of systems in the Voice Conversion Challenge 2016},
year=2016,
booktitle={9th ISCA Speech Synthesis Workshop},
doi={10.21437/SSW.2016-7},
url={http://dx.doi.org/10.21437/SSW.2016-7},
pages={38--43}
}