Predicting Severity of Voice Disorder from DNN-HMM Acoustic Posteriors

Tan Lee, Yuanyuan Liu, Yu Ting Yeung, Thomas K.T. Law, Kathy Y.S. Lee

Acoustical analysis of speech is considered a favorable and promising approach to the objective assessment of voice disorders. Previous research has emphasized the extraction and classification of voice quality features from sustained vowel sounds. This paper presents an investigation of voice assessment using continuous speech utterances of Cantonese. A DNN-HMM based speech recognition system is trained with speech data of unimpaired voice. The recognition accuracy for pathological utterances is found to decrease significantly as disorder severity increases. Average acoustic posterior probabilities are computed for individual phones from the speech recognition output lattices and the DNN soft-max layer. The phone posteriors obtained for continuous speech from the mild, moderate and severe categories are highly distinctive and thus useful for determining voice disorder severity. A subset of Cantonese phonemes is identified as suitable and reliable for voice assessment with continuous speech.
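
The idea of averaging acoustic posteriors per phone can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each frame already has a posterior distribution over phone labels (in a real DNN-HMM system the soft-max outputs are typically over tied HMM states and would first need to be mapped to phones) and that a frame-level phone alignment is available. The function name `average_phone_posteriors`, the phone labels and the toy numbers are all hypothetical.

```python
from collections import defaultdict


def average_phone_posteriors(frame_posteriors, aligned_phones):
    """Average, per phone, the posterior each frame assigns to its aligned phone.

    frame_posteriors: list of dicts, one per frame, mapping phone label -> posterior
                      (assumed to be derived from the DNN soft-max outputs).
    aligned_phones:   list of phone labels, one per frame (assumed alignment).
    """
    if len(frame_posteriors) != len(aligned_phones):
        raise ValueError("need exactly one aligned phone label per frame")

    totals = defaultdict(float)  # sum of posteriors per phone
    counts = defaultdict(int)    # number of frames per phone
    for posteriors, phone in zip(frame_posteriors, aligned_phones):
        # A phone missing from a frame's distribution contributes zero.
        totals[phone] += posteriors.get(phone, 0.0)
        counts[phone] += 1

    return {phone: totals[phone] / counts[phone] for phone in totals}


# Toy data (invented for illustration): three frames, two phone labels.
frames = [
    {"aa": 0.9, "s": 0.1},
    {"aa": 0.7, "s": 0.3},
    {"aa": 0.2, "s": 0.8},
]
alignment = ["aa", "aa", "s"]
averages = average_phone_posteriors(frames, alignment)
```

Under these assumptions, a lower average for a given phone would indicate that the acoustic model, trained on unimpaired voice, is less confident about that phone in the tested speech.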

DOI: 10.21437/Interspeech.2016-1098

Cite as

Lee, T., Liu, Y., Yeung, Y.T., Law, T.K., Lee, K.Y. (2016) Predicting Severity of Voice Disorder from DNN-HMM Acoustic Posteriors. Proc. Interspeech 2016, 97-101.

author={Tan Lee and Yuanyuan Liu and Yu Ting Yeung and Thomas K.T. Law and Kathy Y.S. Lee},
title={Predicting Severity of Voice Disorder from DNN-HMM Acoustic Posteriors},
booktitle={Interspeech 2016},