Pitch-Adaptive Front-end Feature for Hypernasality Detection

Akhilesh Kumar Dubey, S R Mahadeva Prasanna, S Dandapat

Hypernasality in children with cleft palate (CP) is caused by velopharyngeal insufficiency. In hypernasal speech the vowels become nasalized, and the evidence of nasality is mainly present in the low-frequency region around the first formant (F1) of the vowels. Hypernasality detection using the Mel-frequency cepstral coefficient (MFCC) feature may be degraded because the feature may fail to capture the nasality evidence in this low-frequency region. The reason is that MFCCs extracted from high-pitched children's speech carry the pitch-harmonic structure of the magnitude spectrum. This pitch-harmonic effect results in high variance in the higher-order MFCCs. The problem may be aggravated by the high pitch perturbation in CP speech. In this work, therefore, a pitch-adaptive MFCC feature is used for hypernasality detection. The feature is derived from the cepstrally smoothed spectrum instead of the magnitude spectrum. Pitch-adaptive low-time liftering is applied to smooth out the pitch harmonics. When used with a support vector machine (SVM) for hypernasality detection, this feature gives accuracies of 83.45%, 88.04% and 85.58% for the vowels /a/, /i/ and /u/ respectively, which are better than those obtained with the MFCC feature.
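
The pitch-adaptive low-time liftering described in the abstract can be illustrated with a minimal sketch. It assumes that the lifter cutoff is set to a fixed fraction of the pitch period in samples; the abstract does not state the exact rule, so the function name `pitch_adaptive_smooth_spectrum`, the parameter `frac=0.8`, the FFT size and the Hamming window are all hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np

def pitch_adaptive_smooth_spectrum(frame, fs, f0, n_fft=1024, frac=0.8):
    """Cepstrally smoothed log spectrum with a pitch-dependent lifter cutoff.

    frame : 1-D array holding one voiced speech frame
    fs    : sampling rate in Hz
    f0    : pitch estimate for this frame in Hz (from any pitch tracker)
    frac  : assumed fraction of the pitch period kept by the lifter
    """
    windowed = frame * np.hamming(len(frame))
    log_mag = np.log(np.abs(np.fft.fft(windowed, n_fft)) + 1e-10)

    # Real cepstrum: inverse FFT of the log magnitude spectrum.
    cepstrum = np.fft.ifft(log_mag).real

    # The first pitch peak sits near quefrency fs / f0 samples, so the
    # cutoff is placed below it and moves with the pitch of each frame.
    pitch_period = fs / f0
    cutoff = min(max(1, int(frac * pitch_period)), n_fft // 2)

    # Low-time lifter: keep quefrencies below the cutoff (and their
    # mirror images, so the result stays a real, symmetric spectrum).
    lifter = np.zeros(n_fft)
    lifter[:cutoff] = 1.0
    lifter[n_fft - cutoff + 1:] = 1.0

    smooth_log = np.fft.fft(cepstrum * lifter).real
    return smooth_log[: n_fft // 2 + 1], cutoff
```

Under these assumptions, a higher-pitched frame gets a shorter lifter, so the harmonic ripple is removed while as much of the spectral envelope as possible is retained. An MFCC-style feature could then be obtained by passing the exponentiated smoothed spectrum through a Mel filterbank in place of the raw magnitude spectrum.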

DOI: 10.21437/Interspeech.2018-1251

Cite as: Dubey, A.K., Prasanna, S.R.M., Dandapat, S. (2018) Pitch-Adaptive Front-end Feature for Hypernasality Detection. Proc. Interspeech 2018, 372-376, DOI: 10.21437/Interspeech.2018-1251.

  author={Akhilesh Kumar Dubey and S R Mahadeva Prasanna and S Dandapat},
  title={Pitch-Adaptive Front-end Feature for Hypernasality Detection},
  booktitle={Proc. Interspeech 2018},