Understanding the Effect of Voice Quality and Accent on Talker Similarity

Anurag Das, Guanlong Zhao, John Levis, Evgeny Chukharev-Hudilainen, Ricardo Gutierrez-Osuna

This paper presents a methodology to study the role of non-native accents in talker recognition by humans. The methodology combines a state-of-the-art accent-conversion system, which resynthesizes a speaker's voice with an accent different from her/his own, and a protocol for perceptual listening tests to measure the relative contributions of accent and voice quality to speaker similarity. Using a corpus of non-native and native speakers, we generated accent conversions in two directions: non-native speakers with native accents, and native speakers with non-native accents. We then asked listeners to rate the similarity between 50 pairs of real or synthesized speakers. Using a linear mixed-effects model, we find that (for our corpus) the effect of voice quality is five times as large as that of non-native accent, and that the effect goes away when speakers share the same (native) accent. We discuss the potential significance of this work for earwitness identification and sociophonetics.

DOI: 10.21437/Interspeech.2020-2910

Cite as: Das, A., Zhao, G., Levis, J., Chukharev-Hudilainen, E., Gutierrez-Osuna, R. (2020) Understanding the Effect of Voice Quality and Accent on Talker Similarity. Proc. Interspeech 2020, 1763-1767, DOI: 10.21437/Interspeech.2020-2910.

  author={Anurag Das and Guanlong Zhao and John Levis and Evgeny Chukharev-Hudilainen and Ricardo Gutierrez-Osuna},
  title={{Understanding the Effect of Voice Quality and Accent on Talker Similarity}},
  booktitle={Proc. Interspeech 2020},