EUROSPEECH 2001 Scandinavia
Speaker recognition systems usually do not take into account the dependence between the vocal source and the vocal tract. A feasibility study that retains this dependence is presented here. A model of joint probability functions of the pitch and the feature vectors is proposed. Three strategies are designed and compared for all female speakers taken from the SPIDRE corpus. The first operates on all voiced and unvoiced speech segments (baseline strategy). The second considers only the voiced speech segments, and the third includes the pitch information along with the standard MFCC. We use two pattern recognizers: LVQ-SLP and GMM. In all cases, we observe an increase in the identification rates, most notably when using a time duration of 500 ms (6% higher).
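The third strategy (pitch appended to the MFCC vector, scored with a GMM) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the model details, so the 12-dimensional MFCC vectors, the use of log-pitch, the two-component diagonal GMM, the helper names, and the synthetic two-speaker data are all assumptions made for illustration only.

```python
import numpy as np

def component_log_probs(X, weights, means, variances):
    """Log of weight * Gaussian density for every frame and component."""
    diff = X[:, None, :] - means[None, :, :]              # (frames, comps, dims)
    log_gauss = -0.5 * (np.sum(diff**2 / variances, axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1))
    return np.log(weights) + log_gauss                     # (frames, comps)

def fit_diag_gmm(X, n_components=2, n_iter=20, seed=0):
    """Fit a diagonal-covariance GMM with plain EM (illustrative only)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    means = X[rng.choice(n, n_components, replace=False)]
    variances = np.tile(X.var(axis=0) + 1e-3, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each frame.
        log_p = component_log_probs(X, weights, means, variances)
        resp = np.exp(log_p - np.logaddexp.reduce(log_p, axis=1, keepdims=True))
        # M-step: re-estimate weights, means, and floored variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        variances = (resp.T @ X**2) / nk[:, None] - means**2 + 1e-3
    return weights, means, variances

def avg_log_likelihood(X, model):
    """Average per-frame log-likelihood of X under a fitted GMM."""
    log_p = component_log_probs(X, *model)
    return np.logaddexp.reduce(log_p, axis=1).mean()

def make_frames(rng, mfcc_mean, pitch_hz, n=200):
    """Synthetic voiced frames: 12 MFCC-like values plus log-pitch (assumed)."""
    mfcc = rng.normal(mfcc_mean, 1.0, size=(n, 12))
    log_f0 = rng.normal(np.log(pitch_hz), 0.05, size=(n, 1))
    return np.hstack([mfcc, log_f0])                        # joint 13-dim vector

rng = np.random.default_rng(0)
# Two hypothetical speakers with similar spectra but different mean pitch.
train = {"spk_a": make_frames(rng, 0.0, 190.0),
         "spk_b": make_frames(rng, 0.2, 230.0)}
models = {name: fit_diag_gmm(X) for name, X in train.items()}

# Identify an unseen segment by the highest average log-likelihood.
test_frames = make_frames(rng, 0.2, 230.0, n=50)
scores = {name: avg_log_likelihood(test_frames, m) for name, m in models.items()}
predicted = max(scores, key=scores.get)
print(predicted, scores)
```

Modelling the concatenated vector lets each mixture component capture pitch and spectral statistics jointly, which is one simple way to retain the source-tract dependence the abstract refers to. Dropping the last column of each frame gives a voiced-only MFCC system for comparison, corresponding loosely to the second strategy.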
Bibliographic reference. Ezzaidi, Hassan / Rouat, Jean / O’Shaughnessy, Douglas (2001): "Towards combining pitch and MFCC for speaker recognition systems", In EUROSPEECH-2001, 2825-2828.