Automatic Speech Recognition for ILSE-Interviews: Longitudinal Conversational Speech Recordings Covering Aging and Cognitive Decline

Ayimunishagu Abulimiti, Jochen Weiner, Tanja Schultz


The Interdisciplinary Longitudinal Study on Adult Development and Aging (ILSE) was initiated to investigate satisfying and healthy aging. Over 20 years, about 4,200 hours of biographic interviews with more than 1,000 participants were recorded. Spoken language is a strong indicator of declining cognitive resources, as it is affected at an early stage. Hence, various research topics related to aging, such as dementia, could be analyzed based on data such as the ILSE interviews. The analysis of language capabilities requires transcribed speech. Since manual transcription is time-consuming and costly, we aim to transcribe the ILSE data automatically using Automatic Speech Recognition (ASR). The recognition of ILSE interviews is very demanding due to a combination of challenges: 20-year-old analog two-speaker, one-channel recordings of low signal quality, emotional and personal interviews between doctor and participant, and repeated recordings of aging, partly fragile individuals. In this study, we describe ongoing work to develop a hybrid Hidden Markov Model (HMM)-Deep Neural Network (DNN) based ASR system for the ILSE corpus. So far, the best ASR system, obtained by second-pass decoding of a hybrid HMM-DNN model using recurrent neural network based language models, achieves a word error rate of 50.39%.


DOI: 10.21437/Interspeech.2020-2829

Cite as: Abulimiti, A., Weiner, J., Schultz, T. (2020) Automatic Speech Recognition for ILSE-Interviews: Longitudinal Conversational Speech Recordings Covering Aging and Cognitive Decline. Proc. Interspeech 2020, 3795-3799, DOI: 10.21437/Interspeech.2020-2829.


@inproceedings{Abulimiti2020,
  author={Ayimunishagu Abulimiti and Jochen Weiner and Tanja Schultz},
  title={{Automatic Speech Recognition for ILSE-Interviews: Longitudinal Conversational Speech Recordings Covering Aging and Cognitive Decline}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={3795--3799},
  doi={10.21437/Interspeech.2020-2829},
  url={http://dx.doi.org/10.21437/Interspeech.2020-2829}
}