13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Detecting Intelligibility by Linear Dimensionality Reduction and Normalized Voice Quality Hierarchical Features

Dong-Yan Huang, Yongwei Zhu, Dajun Wu, Rongshan Yu

Department of Signal Processing, Institute for Infocomm Research/A*STAR, Singapore

Voice disorders can increase unhealthy social behavior and voice abuse, and can dramatically affect patients' quality of life. Automatic intelligibility detection for pathological voices therefore plays an important role in their timely treatment. This paper designs an intelligibility detection system characterized by two aspects. First, the system uses features inspired by voice pathology: voice quality features, spectral and harmonicity features, and hierarchical features. Second, intelligibility detection is based on the fusion of linear dimensionality reduction methods, such as asymmetric sparse partial least squares (PLS), trained on different sets of normalized features. The best unweighted recall is 71.88% on the test set, an improvement of 2.28% absolute (3.28% relative) over the baseline model's 69.60%.
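
As a reading aid, the reported figures can be checked with a minimal Python sketch. It assumes the usual definition of unweighted recall (the mean of per-class recalls, with every class counted equally regardless of size). The function names and the toy labels "I" and "NI" are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def unweighted_recall(y_true, y_pred):
    """Mean of per-class recalls; each class counts equally (assumed definition)."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        # Number of reference items in class c, and how many were predicted as c.
        n_c = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / n_c)
    return sum(recalls) / len(recalls)


def improvement(system, baseline):
    """Return (absolute, relative) gain of `system` over `baseline`, both in percent."""
    absolute = system - baseline
    relative = 100.0 * absolute / baseline
    return absolute, relative


# Figures quoted in the abstract: 71.88% versus the 69.60% baseline.
abs_gain, rel_gain = improvement(71.88, 69.60)
print(round(abs_gain, 2), round(rel_gain, 2))

# Hypothetical toy labels, only to exercise the metric.
toy_uar = unweighted_recall(["I", "I", "I", "NI"], ["I", "I", "NI", "NI"])
print(toy_uar)
```

The relative figure is the absolute gain divided by the baseline, 2.28 / 69.60, which rounds to the 3.28% quoted above.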

Index Terms: Intelligibility detection, voice quality, hierarchical features, dimensionality reduction

Bibliographic reference. Huang, Dong-Yan / Zhu, Yongwei / Wu, Dajun / Yu, Rongshan (2012): "Detecting intelligibility by linear dimensionality reduction and normalized voice quality hierarchical features", In INTERSPEECH-2012, 546-549.