8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

A Factorial HMM Approach to Robust Isolated Digit Recognition in Background Music

Mark Hasegawa-Johnson, Ameya Deoras

University of Illinois at Urbana-Champaign, USA

This paper presents a novel solution to the problem of isolated digit recognition in background music. A Factorial Hidden Markov Model (FHMM) architecture is proposed to accurately model the simultaneous occurrence of two independent processes, such as an utterance of a digit and an excerpt of music. The FHMM is implemented as its equivalent HMM by extending Nadas' MIXMAX algorithm to mixture-of-Gaussians PDFs. At around 0 dB SNR, the proposed system shows an average relative reduction in word error rate of 57% in the recognition of isolated digits in background music.
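
The abstract does not spell out how the equivalent HMM or the mixture extension is built, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes that the two chains (speech and music) evolve independently, so the product-state transition matrix is the Kronecker product of the two chain matrices, and that observations are log-spectral features for which the MIXMAX assumption holds: each observed component is approximately the maximum of the speech and music components. Under that assumption the density of one component is f_s(y)F_m(y) + F_s(y)f_m(y), and for diagonal-covariance mixtures the likelihood sums over all pairs of mixture components. All function names and the GMM layout (lists of `(weight, means, std_devs)` tuples) are hypothetical.

```python
import math
from itertools import product


def kron(a, b):
    """Kronecker product of two transition matrices given as lists of rows.

    Row index is i * len(b) + k and column index is j * len(b[0]) + l,
    so entry [(i, k), (j, l)] equals a[i][j] * b[k][l].
    """
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]


def norm_pdf(y, mu, sigma):
    """Density of a univariate Gaussian."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def norm_cdf(y, mu, sigma):
    """Cumulative distribution of a univariate Gaussian."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))


def mixmax_likelihood(y, speech_gmm, music_gmm):
    """Likelihood of observation y when each component of y is modeled as
    max(speech, music), with both sources diagonal-covariance GMMs.

    Each GMM is a list of (weight, means, std_devs) tuples (assumed layout).
    """
    total = 0.0
    for (w_s, mu_s, sd_s), (w_m, mu_m, sd_m) in product(speech_gmm, music_gmm):
        pair = w_s * w_m
        for d, y_d in enumerate(y):
            # Density of the max of two independent Gaussians in dimension d.
            pair *= (norm_pdf(y_d, mu_s[d], sd_s[d]) * norm_cdf(y_d, mu_m[d], sd_m[d])
                     + norm_cdf(y_d, mu_s[d], sd_s[d]) * norm_pdf(y_d, mu_m[d], sd_m[d]))
        total += pair
    return total


# Toy usage: a 2-state speech chain and a 2-state music chain give a
# 4-state equivalent HMM; the numbers below are made up for illustration.
speech_trans = [[0.9, 0.1], [0.0, 1.0]]
music_trans = [[0.7, 0.3], [0.4, 0.6]]
joint_trans = kron(speech_trans, music_trans)

speech_gmm = [(0.6, [1.0, 0.0], [1.0, 0.5]), (0.4, [-1.0, 2.0], [0.5, 1.0])]
music_gmm = [(1.0, [0.0, 1.0], [1.0, 1.0])]
likelihood = mixmax_likelihood([0.5, 1.2], speech_gmm, music_gmm)
```

In such a product-state model, each joint state pairs one speech state with one music state, and its emission likelihood would be computed with `mixmax_likelihood` from the two states' mixtures. The number of joint states and of component pairs grows multiplicatively, which is the main cost of this construction.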


Bibliographic reference. Hasegawa-Johnson, Mark / Deoras, Ameya (2004): "A factorial HMM approach to robust isolated digit recognition in background music", In INTERSPEECH-2004, 2093-2096.