Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Improving Tone Recognition with Combined Frequency and Amplitude Modelling

Siwei Wang, Gina-Anne Levow

University of Chicago, USA

To improve tone recognition in continuous speech, we propose a strategy that separates regions influenced by tonal coarticulation from regions that more closely approximate canonical tone production. Given a syllable segmentation, this approach uses amplitude and pitch information to generate an improved sub-syllable segmentation and feature representation. The sub-syllable segmentation is derived from the convex hull of the amplitude-pitch plot. This segmentation strategy yields a 15% improvement over a simple time-only segmentation. Finally, we discuss a future extension using sequential labelling.
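
As a rough illustration of the geometric step mentioned in the abstract, the sketch below computes the convex hull of per-frame (amplitude, pitch) points for a single syllable and reports which frames lie on that hull. The abstract does not say how the hull is turned into a sub-syllable segmentation, so the sketch stops at the hull itself. The function names `convex_hull` and `hull_frame_indices`, the frame-level input format, and the choice of Andrew's monotone chain algorithm are all assumptions made for illustration, not the authors' implementation.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order.

    Collinear points on the hull boundary are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Pop while the last two points and p do not make a left turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def hull_frame_indices(amplitude, pitch):
    """Hypothetical helper: indices of frames whose point is a hull vertex."""
    points = list(zip(amplitude, pitch))
    hull = set(convex_hull(points))
    return [i for i, p in enumerate(points) if p in hull]


if __name__ == "__main__":
    # Made-up frame values for one syllable (not data from the paper).
    amplitude = [0.2, 0.5, 0.9, 1.0, 0.8, 0.4]
    pitch = [180.0, 200.0, 220.0, 215.0, 190.0, 170.0]
    print(hull_frame_indices(amplitude, pitch))
```

Because hull membership is unchanged by rescaling either axis with a positive factor, the sketch does not normalise amplitude or pitch first; any later step that measures distances in the plot would need to choose a scaling.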


Bibliographic reference. Wang, Siwei / Levow, Gina-Anne (2006): "Improving tone recognition with combined frequency and amplitude modelling", in INTERSPEECH-2006, paper 1651-Thu1FoP.8.