Speech Prosody 2006

Dresden, Germany
May 2-5, 2006

F0 Characteristics of Yes-No Question Intonation in Arabic and English: Disambiguation Techniques for Use in ASR

Leslie Barrett (1), Kazue Hata (2)

(1) EDGAR Online, Inc., New York, NY, USA; (2) Santa Barbara, California, USA

This paper presents preliminary research into the possibility of using F0 (fundamental frequency) information to enhance the performance of speech-to-speech translation engines and speech recognition software for Arabic and English. Specifically, we aim to find factors that differentiate yes-no questions from other sentence types in both languages. Although previous research using cross-linguistic question data has shown F0 rise to be the main indicator of yes-no questions, the particular F0 characteristics used by listeners as perceptual cues varied. Using comparative language data, this study aimed to find reliable question indicators that could be detected by automated means. In an experiment with short sentences read by a native speaker of each language, we examined aspects of F0 contours in the two languages to find reliable recognition thresholds. Results indicate that reliable indicators of yes-no questions do exist for both languages and occur within the sentence-final 50 centiseconds (500 ms).
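
As a rough illustration of what an automated check on the sentence-final 50 centiseconds might look like, the following Python sketch measures the F0 change across the voiced frames in the last 0.5 s of an utterance and compares it with a threshold. It is not the procedure used in the paper. The function names, the encoding of unvoiced frames as 0 Hz, the semitone measure, and the 2-semitone threshold are all assumptions made for the example.

```python
import math


def final_f0_rise(times_s, f0_hz, window_s=0.5):
    """Return the F0 change in semitones over the final window of an utterance.

    times_s  -- frame times in seconds, ascending
    f0_hz    -- F0 per frame in Hz; 0 marks an unvoiced frame (assumed convention)
    window_s -- length of the sentence-final window (0.5 s = 50 centiseconds)

    Returns None when fewer than two voiced frames fall inside the window.
    """
    if len(times_s) != len(f0_hz):
        raise ValueError("times_s and f0_hz must have the same length")
    if not times_s:
        return None

    window_start = times_s[-1] - window_s
    # Keep only voiced frames that lie inside the sentence-final window.
    voiced = [f for t, f in zip(times_s, f0_hz) if t >= window_start and f > 0]
    if len(voiced) < 2:
        return None

    # Semitone difference between the last and first voiced frame in the window.
    return 12.0 * math.log2(voiced[-1] / voiced[0])


def is_question_candidate(times_s, f0_hz, threshold_st=2.0):
    """Flag an utterance whose final F0 rise meets a hypothetical threshold.

    threshold_st is illustrative only; it is not a value reported in the paper.
    """
    rise = final_f0_rise(times_s, f0_hz)
    return rise is not None and rise >= threshold_st


if __name__ == "__main__":
    times = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
    contour = [150.0, 150.0, 0.0, 100.0, 0.0, 200.0]
    print(final_f0_rise(times, contour))
    print(is_question_candidate(times, contour))
```

A real system would take the contour from a pitch tracker and would need language-specific thresholds of the kind the study set out to find; the sketch only shows where a sentence-final window fits into such a check.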

Bibliographic reference. Barrett, Leslie / Hata, Kazue (2006): "F0 characteristics of yes-no question intonation in Arabic and English: disambiguation techniques for use in ASR", in SP-2006, paper 047.