EUROSPEECH 2001 Scandinavia
7th European Conference on Speech Communication and Technology

Aalborg, Denmark
September 3-7, 2001


Improving Simultaneous Speech Recognition in Real Room Environments Using Overdetermined Blind Source Separation

Athanasios Koutras, Evangelos Dermatas, George Kokkinakis

University of Patras, Greece

In this paper we present a novel solution to the Overdetermined Blind Speech Separation (OBSS) problem for improving the speech recognition accuracy of N simultaneous speakers in real room environments using M (M>N) microphones. The proposed OBSS system uses basic NxN Blind Speech Separation networks that process, in parallel and in the frequency domain, all different combinations of the available mixture signals, resulting in lower computational complexity and faster convergence. Extensive experiments using an array of two to ten microphones and two simultaneous speakers in a simulated real room showed that when the number of microphones increases beyond two, the separation performance improves and the phoneme recognition accuracy of an HMM-based decoder increases drastically (by more than 6%). Therefore, the introduction of more microphones than speakers is justified in order to improve speech recognition accuracy in environments with multiple simultaneous speakers.
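
The combinatorial structure described in the abstract can be illustrated with a minimal sketch. It assumes that "all different combinations" means every N-element subset of the M microphone signals, each fed to its own NxN separation network. The function name microphone_subsets is hypothetical. The separation networks themselves and the way their outputs are combined are not specified in the abstract and are not modeled here.

```python
from itertools import combinations
from math import comb


def microphone_subsets(num_mics, num_speakers):
    """Enumerate the microphone index subsets, one per NxN network.

    Assumption (not stated in the abstract): each basic NxN network
    receives one distinct N-element subset of the M mixture signals.
    """
    if num_speakers < 1 or num_mics < num_speakers:
        raise ValueError("need at least as many microphones as speakers")
    return list(combinations(range(num_mics), num_speakers))


# Two speakers, as in the reported experiments, with two to ten microphones.
for m in range(2, 11):
    subsets = microphone_subsets(m, 2)
    # The number of parallel networks equals the binomial coefficient C(M, N).
    assert len(subsets) == comb(m, 2)
    print(m, "microphones:", len(subsets), "parallel 2x2 networks")
```

Under this assumption the number of parallel networks grows as the binomial coefficient C(M, N), so two speakers and ten microphones would require 45 basic 2x2 networks.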

Bibliographic reference.  Koutras, Athanasios / Dermatas, Evangelos / Kokkinakis, George (2001): "Improving simultaneous speech recognition in real room environments using overdetermined blind source separation", In EUROSPEECH-2001, 1009-1012.