INTERSPEECH 2004 - ICSLP
This paper presents a novel solution to the problem of isolated digit recognition in background music. A Factorial Hidden Markov Model (FHMM) architecture is proposed to accurately model the simultaneous occurrence of two independent processes, such as an utterance of a digit and an excerpt of music. The FHMM is implemented as its equivalent HMM by extending Nadas' MIXMAX algorithm to a mixture-of-Gaussians PDF. At around 0 dB SNR, the proposed system shows an average relative reduction in word error rate of 57% in the recognition of isolated digits in background music.
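The two ideas named in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes two independent Markov chains (speech and music), so that the transition matrix of the equivalent HMM is the Kronecker product of the two chain matrices. It also assumes the MIXMAX approximation in which the observed log-spectral value is the maximum of the speech and music values. For simplicity each state has a single scalar Gaussian, whereas the paper extends this to mixtures of Gaussians. All function names and numeric values are hypothetical.

```python
import math


def kron(a, b):
    """Kronecker product of two matrices given as nested lists.

    For independent chains, joint state (i, k) maps to index i * len(b) + k.
    """
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


def gauss_pdf(z, mu, sigma):
    """Density of a scalar Gaussian N(mu, sigma^2) at z."""
    u = (z - mu) / sigma
    return math.exp(-0.5 * u * u) / (sigma * math.sqrt(2.0 * math.pi))


def gauss_cdf(z, mu, sigma):
    """Cumulative distribution of a scalar Gaussian N(mu, sigma^2) at z."""
    return 0.5 * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))


def mixmax_density(z, speech, music):
    """Density of z = max(x, y) for independent Gaussians x (speech), y (music).

    Each argument is a (mu, sigma) pair. Either x equals z while y lies
    below it, or y equals z while x lies below it.
    """
    return (gauss_pdf(z, *speech) * gauss_cdf(z, *music)
            + gauss_pdf(z, *music) * gauss_cdf(z, *speech))


# Hypothetical 2-state chains for a digit model and a music model.
A_speech = [[0.9, 0.1], [0.0, 1.0]]
A_music = [[0.7, 0.3], [0.4, 0.6]]

# 4x4 transition matrix of the equivalent HMM over joint states.
A_joint = kron(A_speech, A_music)

# Observation likelihood for one joint state (hypothetical parameters).
likelihood = mixmax_density(0.5, speech=(0.0, 1.0), music=(1.0, 2.0))
```

Each row of `A_joint` still sums to one, so the product model is a valid HMM whose state count is the product of the two chains' state counts. That growth is the main cost of decoding an FHMM through its equivalent HMM.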
Bibliographic reference. Hasegawa-Johnson, Mark / Deoras, Ameya (2004): "A factorial HMM approach to robust isolated digit recognition in background music", In INTERSPEECH-2004, 2093-2096.