ISCA Archive Interspeech 2013

Balancing word lists in speech audiometry through large spoken language corpora

Annemiek Hammer, Bart Vaerenberg, Wojtek Kowalczyk, Louis ten Bosch, Martine Coene, Paul J. Govaerts

This paper describes a distance measure which estimates the distance between a language sample and a reference corpus with regard to graphemes, phonemes and the relation between them. The underlying assumption of this approach is that a language's phoneme distribution can be partially accessed via graphemes. The advantage of using such a measure in speech audiometry is twofold: (i) it may be applied to determine how representative existing word lists are with respect to the distribution of speech sounds in the target language of the test subject; (ii) it enables the audiologist to generate highly representative lists based on large corpora of languages for which broad phonetic transcription is lacking. In this paper the development of the de novo distance measure is described and demonstrated for Dutch. The technique itself, however, is language-independent and has been applied successfully to 10 other EU languages. As such, it paves the way to generating representative word lists as part of speech audiometric test batteries for any given language.
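The abstract does not specify the form of the distance measure, so the following is only a minimal sketch of the general idea of comparing a word list against a reference corpus at the grapheme level. It assumes single letters as graphemes and uses a plain Euclidean distance between relative-frequency distributions as a stand-in. The function names and the toy Dutch word lists are hypothetical, and the paper's actual measure (which also covers phonemes and the grapheme-phoneme relation) may differ substantially.

```python
from collections import Counter
from math import sqrt


def grapheme_distribution(words):
    """Relative frequency of each letter across a list of words.

    Assumption: one letter = one grapheme (a simplification).
    """
    counts = Counter(ch for word in words for ch in word.lower() if ch.isalpha())
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()} if total else {}


def distribution_distance(sample, reference):
    """Euclidean distance between two relative-frequency distributions.

    This is a stand-in metric, not the measure developed in the paper.
    """
    symbols = set(sample) | set(reference)
    return sqrt(sum((sample.get(s, 0.0) - reference.get(s, 0.0)) ** 2 for s in symbols))


# Hypothetical toy data; a real reference would be a large spoken language corpus.
reference_corpus = ["de", "kat", "zit", "op", "de", "mat", "en", "het", "is", "goed"]
word_list_a = ["kat", "mat", "de", "het"]  # letters that are common in the reference
word_list_b = ["zzz", "qqq"]               # letters that are rare or absent in the reference

ref = grapheme_distribution(reference_corpus)
dist_a = distribution_distance(grapheme_distribution(word_list_a), ref)
dist_b = distribution_distance(grapheme_distribution(word_list_b), ref)

print(f"list A distance: {dist_a:.3f}")
print(f"list B distance: {dist_b:.3f}")
```

Under these assumptions, a smaller distance indicates a word list whose letter distribution is closer to that of the reference corpus, which is the sense of "representative" used in the abstract.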

doi: 10.21437/Interspeech.2013-318

Cite as: Hammer, A., Vaerenberg, B., Kowalczyk, W., Bosch, L.t., Coene, M., Govaerts, P.J. (2013) Balancing word lists in speech audiometry through large spoken language corpora. Proc. Interspeech 2013, 3613-3616, doi: 10.21437/Interspeech.2013-318

  author={Annemiek Hammer and Bart Vaerenberg and Wojtek Kowalczyk and Louis ten Bosch and Martine Coene and Paul J. Govaerts},
  title={{Balancing word lists in speech audiometry through large spoken language corpora}},
  booktitle={Proc. Interspeech 2013},