We tested the influence of fundamental oscillation (fo) on human and machine speaker recognition performance in vocalic test utterances. In experiment I, we trained a Gaussian mixture model on 15 speakers (80 multi-word utterances each) and tested it with sustained vowel utterances (/a:/, /i:/ and /u:/) under six fo conditions: three changing (fall, rise, fall-rise) and three steady-state (high, mid, low). Results revealed better performance for the steady-state than for the changing conditions; within the steady-state conditions, performance was poorest for high fo. In experiment II, we tested 9 human listeners on a subset of 4 speakers from experiment I. The listeners went through two training tasks (training 1: multi-word utterances; training 2: words). In the test, they recognized speakers based on the same vocalic utterances as in experiment I (for these 4 speakers). Results showed that performance was about equally high for the changing and steady-state vowels; in the steady-state condition, however, performance was best for high fo vowels. The experiments suggest that (a) fo influences the strength of speaker-specific characteristics in vowels and (b) humans and machines attend to different acoustic information in vocalic utterances for speaker recognition.
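The closed-set identification scheme of experiment I (one Gaussian mixture model per speaker, with a test utterance assigned to the speaker whose model scores it highest) can be illustrated with a minimal sketch. The abstract does not state the acoustic features, the number of mixture components, or the toolkit used, so everything below is an assumption: the 2-D synthetic "frames" stand in for real acoustic feature vectors, the diagonal-covariance EM routine is a generic textbook version, and all names (`fit_diag_gmm`, `identify`, `spk_a`, and so on) are hypothetical.

```python
import math
import random

def log_gauss(x, mean, var):
    # Log density of a diagonal-covariance Gaussian at frame x.
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def fit_diag_gmm(data, k=2, iters=25, seed=0):
    # Plain EM for a k-component diagonal GMM (illustrative, not the authors' setup).
    rng = random.Random(seed)
    dim = len(data[0])
    means = [list(x) for x in rng.sample(data, k)]
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior probability of each component for each frame.
        resp = []
        for x in data:
            logs = [math.log(weights[j]) + log_gauss(x, means[j], variances[j])
                    for j in range(k)]
            top = max(logs)
            ps = [math.exp(l - top) for l in logs]
            total = sum(ps)
            resp.append([p / total for p in ps])
        # M-step: re-estimate weights, means, and (floored) variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-9
            weights[j] = nj / len(data)
            for d in range(dim):
                mu = sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                var = sum(r[j] * (x[d] - mu) ** 2 for r, x in zip(resp, data)) / nj
                means[j][d] = mu
                variances[j][d] = max(var, 1e-3)
    return weights, means, variances

def avg_loglik(model, frames):
    # Average per-frame log-likelihood of an utterance under one speaker model.
    weights, means, variances = model
    total = 0.0
    for x in frames:
        logs = [math.log(w) + log_gauss(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)
        total += top + math.log(sum(math.exp(l - top) for l in logs))
    return total / len(frames)

def identify(models, frames):
    # Closed-set decision: pick the speaker whose model scores highest.
    return max(models, key=lambda spk: avg_loglik(models[spk], frames))

def synth_frames(centers, n, rng, sd=0.5):
    # Synthetic stand-in for feature frames: noisy samples around speaker centers.
    return [[rng.gauss(c, sd) for c in rng.choice(centers)] for _ in range(n)]

rng = random.Random(1)
# Hypothetical speakers, each with two cluster centers in a toy 2-D feature space.
speaker_centers = {
    "spk_a": [(0.0, 0.0), (3.0, 0.0)],
    "spk_b": [(0.0, 6.0), (3.0, 6.0)],
    "spk_c": [(8.0, 3.0), (8.0, -3.0)],
}
models = {spk: fit_diag_gmm(synth_frames(c, 200, rng))
          for spk, c in speaker_centers.items()}
test_utts = {spk: synth_frames(c, 50, rng) for spk, c in speaker_centers.items()}
predictions = {spk: identify(models, frames) for spk, frames in test_utts.items()}
```

In a setup like the one described, the training frames would come from the multi-word utterances and the test frames from the sustained vowels in each fo condition; identification accuracy per condition would then be the proportion of test utterances for which `identify` returns the true speaker.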
DOI: 10.21437/Interspeech.2018-2331
Cite as: Dellwo, V., Kathiresan, T., Pellegrino, E., He, L., Schwab, S., Maurer, D. (2018) Influences of Fundamental Oscillation on Speaker Identification in Vocalic Utterances by Humans and Computers. Proc. Interspeech 2018, 3795-3799, DOI: 10.21437/Interspeech.2018-2331.
@inproceedings{Dellwo2018,
  author={Volker Dellwo and Thayabaran Kathiresan and Elisa Pellegrino and Lei He and Sandra Schwab and Dieter Maurer},
  title={Influences of Fundamental Oscillation on Speaker Identification in Vocalic Utterances by Humans and Computers},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3795--3799},
  doi={10.21437/Interspeech.2018-2331},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2331}
}