14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Quantifying Cross-Linguistic Variation in Grapheme-to-Phoneme Mapping

Martine Coene (1), Annemiek Hammer (1), Wojtek Kowalczyk (2), Louis ten Bosch (3), Bart Vaerenberg (4), Paul J. Govaerts (4)

(1) Vrije Universiteit Amsterdam, The Netherlands
(2) Universiteit Leiden, The Netherlands
(3) Radboud Universiteit Nijmegen, The Netherlands
(4) Eargroup, Belgium

In the literature, languages have been described as having more or less transparent orthographies, depending on how predictable their spelling-to-sound correspondences are. Quantitative measures that are based on large-scale language corpora and can objectively assess such cross-linguistic variation are rather scarce. The quantitative assessment method presented here builds on the correlation between distances of phonemic and graphemic frequency distributions of a given sample and similar distances obtained from large corpora of the same language. The metric itself may be used as a research tool to investigate the potential effect of orthographic transparency on the development and performance of reading in different populations.
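
As a rough illustration of the idea in the abstract, the sketch below computes grapheme and phoneme frequency distributions, measures how far each sample lies from a reference corpus, and correlates the two sets of distances. The abstract does not say which distance measure or correlation coefficient is used, so Euclidean distance and Pearson's r are assumptions here. The function names (freq_dist, euclidean_distance, pearson, transparency_score) are hypothetical, and this is not the authors' implementation.

```python
from collections import Counter
from math import sqrt


def freq_dist(symbols):
    """Relative frequency of each symbol (grapheme or phoneme) in a sequence."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def euclidean_distance(p, q):
    """Euclidean distance between two frequency distributions (assumed measure)."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


def pearson(xs, ys):
    """Pearson correlation coefficient (assumed choice of correlation)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("distances have zero variance; correlation undefined")
    return cov / sqrt(vx * vy)


def transparency_score(samples, ref_graphemes, ref_phonemes):
    """Correlate graphemic and phonemic distances of samples to a reference corpus.

    samples: list of (grapheme_sequence, phoneme_sequence) pairs.
    """
    ref_g = freq_dist(ref_graphemes)
    ref_p = freq_dist(ref_phonemes)
    g_dists = [euclidean_distance(freq_dist(g), ref_g) for g, _ in samples]
    p_dists = [euclidean_distance(freq_dist(p), ref_p) for _, p in samples]
    return pearson(g_dists, p_dists)


# Toy data: a fully transparent mapping a->A, b->B, c->C (invented for illustration).
toy_samples = [("aab", "AAB"), ("abc", "ABC"), ("ccc", "CCC")]
toy_score = transparency_score(toy_samples, "aabbcc", "AABBCC")
```

In this toy case every grapheme maps to exactly one phoneme, so the two distance series coincide and the correlation is at its maximum; a less transparent orthography would be expected to give a weaker correlation under these assumptions.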


