The Althingi ASR System

Inga R. Helgadóttir, Anna Björk Nikulásdóttir, Michal Borský, Judy Y. Fong, Róbert Kjaran, Jón Guðnason

All speeches delivered in the Icelandic parliament, Althingi, are transcribed and published. An automatic speech recognition (ASR) system has been developed to reduce the manual work involved. To our knowledge, this is the first open-source speech recognizer in use for Icelandic. This paper describes the development of the ASR system, evaluates its in-lab performance, and reports first results from its users. A word error rate (WER) of 7.91% was obtained on our in-lab speech recognition test set using a time-delay deep neural network (TDNN) and re-scoring with a bidirectional recurrent neural network language model (RNN-LM). That number does not include any further processing of the text. The in-lab F-score is 80.6 for the punctuation model and 61.6 for the paragraph model. When tested in the wild, the WER of the ASR, including punctuation marks and other post-processing, was 15.0 ± 6.0% over 625 speeches. This is an upper limit, since not all mismatches with the reference text are true errors of the ASR. The transcribers of Althingi graded 77% of the speech transcripts as Good. The Althingi corpus and ASR recipe constitute a valuable resource for further developments within Icelandic language technology.
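For readers unfamiliar with the metric, the sketch below shows the standard definition of WER: the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. It is a minimal illustration under that textbook definition, not the scoring pipeline used in the paper; the function name `word_error_rate` and the whitespace tokenization are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative only: tokens are split on whitespace, so punctuation marks
    written as separate tokens count as words (an assumption of this sketch).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)
```

With this definition, a hypothesis that omits a punctuation token present in the reference is charged one deletion, which illustrates why a WER computed on post-processed text with punctuation can be higher than one computed on the raw recognizer output.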

DOI: 10.21437/Interspeech.2019-1248

Cite as: Helgadóttir, I.R., Nikulásdóttir, A.B., Borský, M., Fong, J.Y., Kjaran, R., Guðnason, J. (2019) The Althingi ASR System. Proc. Interspeech 2019, 3013-3017, DOI: 10.21437/Interspeech.2019-1248.

  author={Inga R. Helgadóttir and Anna Björk Nikulásdóttir and Michal Borský and Judy Y. Fong and Róbert Kjaran and Jón Guðnason},
  title={{The Althingi ASR System}},
  booktitle={Proc. Interspeech 2019},