13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Text-dependent Pathological Voice Detection

Gopala Krishna Anumanchipalli (1,2), Hugo Meinedo (2), Miguel Bugalho (2), Isabel Trancoso (2), Luís C. Oliveira (2), Alan W. Black (1)

(1) Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA
(2) Spoken Language Systems Laboratory, INESC-ID/IST Lisboa, Portugal

This work presents features that exploit the underlying text in the detection of pathological voices. While global characteristics of the speaker's source and spectral features have been successfully employed in pathological voice detection, the underlying text has largely been ignored. We focus on experiments that exploit the text stimulus read by the subject. Text-derived features include the mean cepstral distortion of the subject from an average intelligible speaker; prosodic features include the speaking rate and statistics of phoneme durations, among others. The phonetic labelling is also used to exclude all unvoiced regions of the speech samples, which improves the discriminability between intelligible and pathological voices. We also design features that capture the speaker's overall closeness to intelligible instances of the same text stimulus from other speakers. Our experiments show that the proposed text-derived features improve the detection of pathological voices by 20%.
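
As a rough illustration of the kind of features the abstract names, the sketch below computes a mean mel-cepstral distortion restricted to voiced frames, along with speaking rate and phoneme-duration statistics. It is not the authors' implementation. The function names, the `VOICED` phone set, the input layout, and the use of the common mel-cepstral distortion formula are all assumptions made for this example, and the frames are assumed to be already time-aligned between the subject and the reference.

```python
import math
from statistics import mean, pstdev

# Hypothetical set of phone labels treated as voiced (an assumption for
# this sketch; a real system would use its own phone inventory).
VOICED = {"a", "e", "i", "o", "u", "m", "n", "l", "r", "z", "v", "b", "d", "g"}


def mean_cepstral_distortion(subject, reference, labels=None):
    """Mean mel-cepstral distortion (in dB) over time-aligned frames.

    subject, reference: lists of cepstral vectors of equal dimension,
    assumed to be already aligned frame by frame.
    labels: optional per-frame phone labels; when given, frames whose
    label is not in VOICED are skipped.
    """
    scale = 10.0 / math.log(10.0) * math.sqrt(2.0)
    distances = []
    for i, (s, r) in enumerate(zip(subject, reference)):
        if labels is not None and labels[i] not in VOICED:
            continue  # ignore unvoiced regions
        squared = sum((a - b) ** 2 for a, b in zip(s, r))
        distances.append(scale * math.sqrt(squared))
    if not distances:
        raise ValueError("no frames left to score")
    return mean(distances)


def duration_features(segments):
    """Speaking rate and phoneme-duration statistics.

    segments: list of (phone, start_sec, end_sec) tuples in time order.
    """
    durations = [end - start for _, start, end in segments]
    total = segments[-1][2] - segments[0][1]
    return {
        "speaking_rate": len(segments) / total,  # phones per second
        "mean_duration": mean(durations),
        "std_duration": pstdev(durations),
    }
```

Here `reference` would stand in for the average intelligible speaker reading the same text, and the resulting values would be passed to a classifier together with any other features.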

Index Terms: Pathological voices, example based detection, text-driven features, fusion of classification methods

Bibliographic reference.  Anumanchipalli, Gopala Krishna / Meinedo, Hugo / Bugalho, Miguel / Trancoso, Isabel / Oliveira, Luís C. / Black, Alan W. (2012): "Text-dependent pathological voice detection", In INTERSPEECH-2012, 530-533.