Speech Prosody 2008

Campinas, Brazil
May 6-9, 2008

Detecting Non-Modal Phonation in Telephone Speech

Tae-Jin Yoon (1), Jennifer Cole (2), Mark Hasegawa-Johnson (3)

(1) Department of Linguistics, University of Victoria, Canada
(2) Department of Linguistics, University of Illinois at Urbana-Champaign, USA
(3) Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, USA

Non-modal phonation conveys both linguistic and paralinguistic information and is distinguished by acoustic source and filter features. Detecting non-modal phonation in speech requires reliable F0 analysis, which is a problem for telephone-band speech, where F0 analysis frequently fails. We demonstrate an approach to detecting creaky phonation in telephone speech based on robust F0 and spectral analysis. Our F0 analysis relies on an autocorrelation algorithm applied to the intensity-boosted and inverse-filtered speech signal, and it succeeds in regions of non-modal phonation where F0 analysis of the unfiltered signal typically fails. In addition to the extracted F0 values, spectral amplitude is measured at the first two harmonics (H1, H2) and at the first three formants (A1, A2, A3). Visual and spectral inspection of the detected creaky phonation confirms findings reported from laboratory settings. Statistical analysis using one-way ANOVA and classification using a Support Vector Machine (SVM) yield promising results that motivate further improvement of the automatic detection of non-modal phonation in telephone speech.
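To make the two kinds of measurement named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It estimates F0 for a single frame from the peak of its autocorrelation and then reads spectral amplitude at the first two harmonics to form H1-H2. The intensity-boosting and inverse-filtering steps are omitted, and the function names, the 50-400 Hz search range, the 40 ms frame, and the synthetic two-harmonic test signal are all assumptions made for illustration.

```python
import numpy as np

def autocorr_f0(frame, sr, fmin=50.0, fmax=400.0):
    """Estimate F0 (Hz) of one frame from the peak of its autocorrelation.

    Hypothetical helper: the search range fmin..fmax is an assumption.
    """
    frame = frame - np.mean(frame)
    frame = frame * np.hanning(len(frame))
    # Keep non-negative lags only (index 0 is lag 0).
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sr / fmax)
    lag_max = min(int(sr / fmin), len(ac) - 1)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sr / lag

def harmonic_amplitude_db(frame, sr, freq, tol=0.1, n_fft=8192):
    """Peak spectral magnitude (dB) within +/- tol*freq of a target frequency."""
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed, n=n_fft))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    band = (freqs >= freq * (1 - tol)) & (freqs <= freq * (1 + tol))
    return 20.0 * np.log10(np.max(spectrum[band]) + 1e-12)

# Synthetic 40 ms frame at a telephone-style 8 kHz sampling rate (assumed):
# a 100 Hz fundamental plus a second harmonic at half the amplitude.
sr = 8000
t = np.arange(int(0.040 * sr)) / sr
frame = np.cos(2 * np.pi * 100 * t) + 0.5 * np.cos(2 * np.pi * 200 * t)

f0 = autocorr_f0(frame, sr)
h1 = harmonic_amplitude_db(frame, sr, f0)
h2 = harmonic_amplitude_db(frame, sr, 2 * f0)
h1_minus_h2 = h1 - h2  # spectral tilt measure between the first two harmonics

print(f"F0 estimate: {f0:.1f} Hz, H1-H2: {h1_minus_h2:.1f} dB")
```

On real telephone speech, per-frame values such as these (together with formant amplitudes, which this sketch does not compute) would be the kind of features passed to a statistical test or a classifier. The peak-picking here has no voicing decision or octave-error handling, so it should be read as an illustration of the idea only.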


Bibliographic reference. Yoon, Tae-Jin / Cole, Jennifer / Hasegawa-Johnson, Mark (2008): "Detecting non-modal phonation in telephone speech", in SP-2008, 33-36.