Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

On the Fusion of Prosody, Voice Spectrum and Face Features for Multimodal Person Verification

M. Farrús, A. Garde, P. Ejarque, J. Luque, Javier Hernando

Universitat Politècnica de Catalunya, Spain

Multimodal person recognition systems normally use short-term spectral features as voice information. In this paper, prosodic information is added to a system based on face and voice spectrum features. Using two fusion techniques, support vector machines and matcher weighting, several fusion strategies are proposed in which the monomodal scores are fused in multiple steps. System performance clearly improves when the prosodic information is added, and the best results are achieved when the prosodic scores are fused first and the resulting scores are then fused with the spectral and facial scores. Speech and face scores were obtained on the Switchboard-I and XM2VTS databases, respectively.
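
The multi-step strategy described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the common formulation of matcher weighting, in which each matcher's weight is proportional to the inverse of its equal error rate (EER), and it assumes all scores have already been normalized to a common range. The function names, the scores, and the EER values are hypothetical and chosen only for illustration.

```python
def matcher_weights(eers):
    """Weights inversely proportional to each matcher's EER, summing to 1.

    Assumption: this is the usual matcher weighting rule; the abstract
    does not spell out the exact formula.
    """
    inverse = [1.0 / e for e in eers]
    total = sum(inverse)
    return [v / total for v in inverse]


def fuse(scores, eers):
    """Weighted sum of normalized scores using matcher weights."""
    weights = matcher_weights(eers)
    return sum(w * s for w, s in zip(weights, scores))


def two_step_fusion(prosodic_scores, prosodic_eers, fused_prosody_eer,
                    spectral_score, spectral_eer, face_score, face_eer):
    """Fuse the prosodic scores first, then fuse the result with the
    spectral and facial scores (the best-performing order per the abstract)."""
    prosody = fuse(prosodic_scores, prosodic_eers)
    return fuse([prosody, spectral_score, face_score],
                [fused_prosody_eer, spectral_eer, face_eer])


# Hypothetical normalized scores and EERs for a single verification trial.
final_score = two_step_fusion(
    prosodic_scores=[0.60, 0.40],
    prosodic_eers=[0.25, 0.30],
    fused_prosody_eer=0.20,
    spectral_score=0.80,
    spectral_eer=0.10,
    face_score=0.90,
    face_eer=0.05,
)
print(final_score)
```

In this sketch the matcher with the lowest EER receives the largest weight, so less reliable prosodic scores influence the final decision less than the spectral and facial scores. The support vector machine fusion also used in the paper would replace the weighted sum with a trained classifier over the score vector.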


Bibliographic reference. Farrús, M. / Garde, A. / Ejarque, P. / Luque, J. / Hernando, Javier (2006): "On the fusion of prosody, voice spectrum and face features for multimodal person verification", in INTERSPEECH-2006, paper 1256-Wed3CaP.8.