In our previous work [1, 2], we trained an HMM-based speech recognizer without transcriptions or any other knowledge or resources. The trained HMM recognizer was used to transcribe audio into self-organized units (SOUs), and we evaluated its performance on the task of topic identification. In this paper, we report our work on applying SOUs to discover audio patterns in spoken documents without supervision. Because the audio is recognized into SOUs, which are sound-like units, the discovery of common audio patterns can be carried out extremely efficiently over a large corpus, without the dynamic programming comparisons proposed by earlier work. Experiments were performed on Mandarin conversational telephone speech using both the one-best SOU token sequences and SOU consensus networks. We show that, using SOUs as keys to audio patterns, we can discover frequently spoken words with good purity.
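The idea of using SOUs as keys can be illustrated with a minimal sketch. This is not the procedure from the paper: it assumes, for illustration only, that each document is already available as a one-best SOU token sequence and that a "pattern" is a fixed-length n-gram of SOU labels. The function name `find_repeated_patterns`, the unit labels such as `u2`, and the parameters `n` and `min_count` are all hypothetical. The point is only that hashing token n-grams finds repeated patterns in a single pass over the corpus, with no pairwise dynamic programming alignment of the audio.

```python
from collections import defaultdict


def find_repeated_patterns(docs, n=3, min_count=2):
    """Index every SOU n-gram by hashing, then keep those that recur.

    docs: dict mapping a document id to its one-best SOU token sequence
          (a list of hypothetical unit labels).
    Returns a dict {ngram: [(doc_id, start_index), ...]} restricted to
    n-grams seen at least min_count times.
    """
    index = defaultdict(list)
    for doc_id, tokens in docs.items():
        # Slide a window of n tokens over the sequence; each window is a key.
        for start in range(len(tokens) - n + 1):
            key = tuple(tokens[start:start + n])
            index[key].append((doc_id, start))
    # Keep only keys that occur often enough to count as a repeated pattern.
    return {k: v for k, v in index.items() if len(v) >= min_count}


# Toy corpus: two "calls" that share the SOU trigram (u2, u9, u4).
docs = {
    "call_a": ["u7", "u2", "u9", "u4", "u1"],
    "call_b": ["u3", "u2", "u9", "u4", "u8"],
}
patterns = find_repeated_patterns(docs, n=3, min_count=2)
for ngram, hits in patterns.items():
    print(ngram, hits)
```

Exact n-gram matching is the simplest possible choice of key. It does not model recognition errors, which the consensus networks mentioned above may help with, and that part is not sketched here.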
Bibliographic reference. Siu, Man-hung / Gish, Herbert / Lowe, Steve / Chan, Arthur (2011): "Unsupervised audio patterns discovery using HMM-based self-organized units", In INTERSPEECH-2011, 2333-2336.