Hypernasality Severity Detection Using Constant Q Cepstral Coefficients

Akhilesh Kumar Dubey, S.R. Mahadeva Prasanna, S. Dandapat

In this work, detection of hypernasality severity in cleft palate speech is attempted using the constant Q cepstral coefficients (CQCC) feature. The coupling of the nasal tract with the oral tract during the production of hypernasal speech adds nasal formants and anti-formants in the low-frequency region of the vowel spectrum, mainly around the first formant. The strength and position of the nasal formants and anti-formants, along with those of the oral formants, change as the severity of nasality changes in hypernasal speech. The CQCC feature is extracted from the constant Q transform (CQT) spectrum, which employs geometrically spaced frequency bins and maintains a constant Q factor across the entire spectrum. This results in higher frequency resolution at lower frequencies and higher temporal resolution at higher frequencies. The CQT spectrum resolves the nasal and oral formants at low frequencies and captures the spectral changes caused by changes in nasality severity. Using a multiclass support vector classifier, the CQCC feature gives overall classification accuracies of 83.33% and 78.47% for the vowels /i/ and /u/, respectively, when classifying speech as normal, mild, or moderate-severe hypernasal.
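The resolution trade-off described in the abstract follows from the CQT bin layout, and a minimal sketch may make it concrete. The function name `cqt_bins` and every parameter value below (50 Hz minimum frequency, 12 bins per octave, 73 bins, 16 kHz sampling rate) are illustrative assumptions, not settings reported by the authors. The sketch covers only the bin geometry, not the cepstral steps that turn a CQT spectrum into CQCC.

```python
import math


def cqt_bins(f_min, bins_per_octave, n_bins, sample_rate):
    """Return the Q factor and (center_hz, bandwidth_hz, window_samples) per bin.

    Hypothetical helper for illustration; all arguments are assumed values.
    """
    # Constant Q factor: ratio of center frequency to bandwidth, same for all bins.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    bins = []
    for k in range(n_bins):
        # Geometrically spaced center frequencies.
        f_k = f_min * 2.0 ** (k / bins_per_octave)
        # Bandwidth grows with frequency, so low bins are narrow (fine frequency resolution).
        bandwidth = f_k / q
        # Analysis window shrinks with frequency, so high bins are short (fine time resolution).
        window = math.ceil(q * sample_rate / f_k)
        bins.append((f_k, bandwidth, window))
    return q, bins


# Assumed example settings, chosen only so the top bin stays below the Nyquist frequency.
q, bins = cqt_bins(f_min=50.0, bins_per_octave=12, n_bins=73, sample_rate=16000)
low, high = bins[0], bins[-1]
print(f"Q = {q:.2f}")
print(f"lowest bin:  {low[0]:.1f} Hz, bandwidth {low[1]:.2f} Hz, window {low[2]} samples")
print(f"highest bin: {high[0]:.1f} Hz, bandwidth {high[1]:.2f} Hz, window {high[2]} samples")
```

Under these assumed settings the lowest bins are much narrower in frequency than the highest ones, which is the property the abstract relies on to separate nasal and oral formants near the first formant.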

DOI: 10.21437/Interspeech.2019-2151

Cite as: Dubey, A.K., Prasanna, S.M., Dandapat, S. (2019) Hypernasality Severity Detection Using Constant Q Cepstral Coefficients. Proc. Interspeech 2019, 4554-4558, DOI: 10.21437/Interspeech.2019-2151.

  author={Akhilesh Kumar Dubey and S.R. Mahadeva Prasanna and S. Dandapat},
  title={{Hypernasality Severity Detection Using Constant Q Cepstral Coefficients}},
  booktitle={Proc. Interspeech 2019},