Using State of the Art Speaker Recognition and Natural Language Processing Technologies to Detect Alzheimer’s Disease and Assess its Severity

Raghavendra Pappagari, Jaejin Cho, Laureano Moro-Velázquez, Najim Dehak


In this study, we analyze the use of state-of-the-art technologies for speaker recognition and natural language processing to detect Alzheimer’s Disease (AD) and to assess its severity by predicting Mini-Mental State Examination (MMSE) scores. To this end, we study the use of speech signals and transcriptions. Our work focuses on adapting state-of-the-art models for each modality individually and for both together, to examine their complementarity. We used x-vectors to characterize speech signals and pre-trained BERT models to process human transcriptions, with different back-ends, for AD diagnosis and assessment. We also evaluated features based on silence segments of the audio files as a complement to x-vectors. We trained and evaluated our systems on the Interspeech 2020 ADReSS challenge dataset, containing 78 AD patients and 78 sex- and age-matched controls. Our results indicate that fusing the scores obtained from the acoustic and the transcript-based models provides the best detection and assessment results, suggesting that the individual models for the two modalities contain complementary information. Adding the silence-related features improved the fusion system further. A separate analysis of the models suggests that transcript-based models provide better results than acoustic models in the detection task but similar results in the MMSE prediction task.
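
As an illustration of the score-level fusion mentioned in the abstract, the following is a minimal sketch only. The abstract does not specify the fusion rule, so the weighted average used here, the function names `fuse_scores` and `classify`, the 0.5 weight and threshold, and the example scores are all assumptions made for illustration, not the authors' method or data.

```python
from typing import List, Sequence


def fuse_scores(acoustic: Sequence[float],
                transcript: Sequence[float],
                weight: float = 0.5) -> List[float]:
    """Weighted average of per-recording scores from two models.

    `weight` is the share given to the acoustic model (assumed value);
    the transcript model receives `1 - weight`.
    """
    if len(acoustic) != len(transcript):
        raise ValueError("both models must score the same recordings")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * a + (1.0 - weight) * t
            for a, t in zip(acoustic, transcript)]


def classify(scores: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Label a recording as AD (True) when its fused score reaches the threshold."""
    return [s >= threshold for s in scores]


# Hypothetical AD posterior scores for three recordings (made-up numbers).
acoustic_scores = [0.30, 0.80, 0.55]
transcript_scores = [0.10, 0.90, 0.35]

fused = fuse_scores(acoustic_scores, transcript_scores)
labels = classify(fused)
```

In this toy example the third recording is labeled AD by the acoustic scores alone but not after fusion, which shows how a second modality can change a borderline decision. In practice the weight would be tuned on held-out data rather than fixed.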


DOI: 10.21437/Interspeech.2020-2587

Cite as: Pappagari, R., Cho, J., Moro-Velázquez, L., Dehak, N. (2020) Using State of the Art Speaker Recognition and Natural Language Processing Technologies to Detect Alzheimer’s Disease and Assess its Severity. Proc. Interspeech 2020, 2177-2181, DOI: 10.21437/Interspeech.2020-2587.


@inproceedings{Pappagari2020,
  author={Raghavendra Pappagari and Jaejin Cho and Laureano Moro-Velázquez and Najim Dehak},
  title={{Using State of the Art Speaker Recognition and Natural Language Processing Technologies to Detect Alzheimer’s Disease and Assess its Severity}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={2177--2181},
  doi={10.21437/Interspeech.2020-2587},
  url={http://dx.doi.org/10.21437/Interspeech.2020-2587}
}