INTERSPEECH 2006 - ICSLP
In this study, we present two novel fusion approaches for merging subword-based and word-based retrieval methods within a multilingual spoken document retrieval (SDR) system. More than 6000 languages are spoken in the world today, so the resources (e.g., text and audio data, pronunciation lexicons) needed to develop Automatic Speech Recognition (ASR) systems for them, and accordingly the performance of those ASR systems, can be viewed as a tiered structure. Even for resource-rich languages, some applications (e.g., historical digital archives) contain acoustic and lexical variation over time, which makes it challenging to build effective, up-to-date audio indexing and retrieval systems. Within this context, we focus on creating robust multilingual SDR systems that employ both word-based and subword-based retrieval methods. Our proposed algorithms employ an out-of-vocabulary (OOV) word detection module to generate hybrid transcripts/lattices. In our Dynamic Fusion (DF) approach, the hybrid transcripts/lattices are used to assign dynamic fusion weights to each subsystem. In our Hybrid Fusion (HF) approach, queries are searched directly in the hybrid lattices. We evaluated the proposed algorithms on a proper name retrieval task in the Spanish Broadcast News domain and on a spoken document retrieval task using our historical speech archive, the NGSW corpus, where the proposed algorithms yield improvements over traditional fusion methods.
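As a rough illustration of the Dynamic Fusion idea, the sketch below combines a word-based and a subword-based relevance score with a weight that varies per document. The abstract does not specify how the weights are computed, so everything here is an assumption made for illustration: the `DocScores` fields, the use of the OOV-flagged fraction of the hybrid transcript as the driving signal, and the linear rule `word_weight = 1 - oov_fraction` are all hypothetical and are not the authors' formulation.

```python
from dataclasses import dataclass


@dataclass
class DocScores:
    """Per-document inputs to fusion (all fields are hypothetical)."""
    word_score: float     # relevance score from the word-based subsystem
    subword_score: float  # relevance score from the subword-based subsystem
    oov_fraction: float   # share of the hybrid transcript flagged as OOV, in [0, 1]


def static_fusion(doc: DocScores, word_weight: float = 0.5) -> float:
    """Conventional linear fusion with one fixed weight for every document."""
    return word_weight * doc.word_score + (1.0 - word_weight) * doc.subword_score


def dynamic_fusion(doc: DocScores) -> float:
    """Linear fusion whose weight depends on the document's OOV content.

    Assumed rule (not from the paper): the more of the transcript the OOV
    detector flags, the less the word-based subsystem is trusted.
    """
    if not 0.0 <= doc.oov_fraction <= 1.0:
        raise ValueError("oov_fraction must lie in [0, 1]")
    word_weight = 1.0 - doc.oov_fraction
    return static_fusion(doc, word_weight)


if __name__ == "__main__":
    # A document whose transcript is mostly OOV regions.
    doc = DocScores(word_score=0.2, subword_score=0.8, oov_fraction=0.75)
    print("static :", static_fusion(doc))
    print("dynamic:", dynamic_fusion(doc))
```

In this toy setup a document dominated by OOV regions leans on the subword score, while a document with no flagged regions reduces to the word-based score alone. A real system would presumably derive the weights from lattice-level evidence rather than a single fraction.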
Bibliographic reference. Akbacak, Murat / Hansen, John H. L. (2006): "A robust fusion method for multilingual spoken document retrieval systems employing tiered resources", In INTERSPEECH-2006, paper 1835-Tue2CaP.8.