First Workshop on Speech, Language and Audio in Multimedia (SLAM 2013)
We describe the QCompere consortium submissions to the REPERE 2013 evaluation campaign. The REPERE challenge aims at bringing together four communities (face recognition, speaker identification, optical character recognition and named entity detection) toward the same goal: multimodal person recognition in TV broadcast. First, four mono-modal components are introduced (one for each of the aforementioned communities), constituting the elementary building blocks of our various submissions. Then, depending on the target modality (speaker or face recognition) and on the task (supervised or unsupervised recognition), four different fusion techniques are introduced: they can be summarized as propagation-, classifier-, rule- or graph-based approaches. Finally, their performance is evaluated on the REPERE 2013 test set, and their advantages and limitations are discussed.
Index Terms: speaker identification, face recognition, named entity detection, video optical character recognition, multimodal fusion
Bibliographic reference. Bredin, Hervé / Poignant, Johann / Fortier, Guillaume / Tapaswi, Makarand / Le, Viet-Bac / Roy, Anindya / Barras, Claude / Rosset, Sophie / Sarkar, Achintya / Yang, Qian / Gao, Hua / Mignon, Alexis / Verbeek, Jakob / Besacier, Laurent / Quénot, Georges / Ekenel, Hazim Kemal / Stiefelhagen, Rainer (2013): "QCompere @ REPERE 2013", In SLAM-2013, 49-54.