INTERSPEECH 2006 - ICSLP
In this paper we study the effectiveness of prosodic features for speaker verification. We hypothesize that prosody is linked to linguistic units such as syllables and prosodic features can be better represented with reference to the syllabic sequence. For extracting prosodic features, speech is segmented into syllable-like regions using the knowledge of vowel onset points (VOP). We use a technique based on excitation source information to detect VOPs automatically. The location of VOPs serve as reference for extracting prosodic features directly from speech signal. Various parameters are used to represent the pitch and energy dynamics of the region between two consecutive VOPs. The effectiveness of the derived prosodic features for speaker verification is demonstrated on NIST SRE 2003 extended data. The complementary nature of prosodic features and spectral features help to improve the accuracy of the combined speaker verification system.
Bibliographic reference. Mary, Leena / Yegnanarayana, B. (2006): "Prosodic features for speaker verification", In INTERSPEECH-2006, paper 1999-Tue1CaP.4.