Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Multistage Convolutive Blind Source Separation for Speech Mixture

Yanxue Liang, Ichiro Hagiwara

Tokyo Institute of Technology, Japan

Blind source separation of convolutive mixtures of speech signals has been addressed extensively in the literature. However, the widely applied Multichannel Blind Deconvolution (MBD) method suffers from the whitening effect, or arbitrary filtering problem, which dramatically degrades the performance of Automatic Speech Recognition systems. In this paper, a new MBD-based multistage method is proposed in which the final goal is the contribution of each source to every microphone rather than the original signals. In detail, MBD is first implemented using an entropy maximization criterion combined with the Natural Gradient (NG) algorithm. A compensation matrix is then constructed, with which each source is recovered as its contribution to every microphone; that is, the whitening effect or arbitrary filtering problem is transformed into a fixed filtering problem. After compensation, the problem for any given source becomes a Single-Input Multiple-Output (SIMO) problem. Thus, not only can the spatial quality of the source be preserved, but SIMO blind deconvolution can also be applied to fully recover the temporal structure of the speech signal. Finally, experiments show the validity of the method and its superiority over other methods in both the efficiency of spectral preservation and speed.
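
The first two stages described in the abstract can be illustrated with a much simpler analogue. The sketch below is not the authors' convolutive algorithm: it assumes an instantaneous (non-convolutive) two-microphone mixture, a tanh score function, and synthetic Laplacian sources. The names `natural_gradient_infomax` and `source_contributions`, the mixing matrix `A`, and all parameter values are hypothetical choices made for illustration. In this simplified setting the inverse of the unmixing matrix plays the role of the compensation matrix.

```python
import numpy as np

def natural_gradient_infomax(x, n_iter=200, mu=0.05):
    """Entropy-maximization unmixing with natural-gradient updates.

    x has shape (n_channels, n_samples). Instantaneous mixing is assumed
    here; the paper itself treats the convolutive case.
    """
    n, n_samples = x.shape
    W = np.eye(n)
    identity = np.eye(n)
    for _ in range(n_iter):
        y = W @ x
        phi = np.tanh(y)  # assumed score function for super-Gaussian sources
        # Natural-gradient update: W <- W + mu * (I - E[phi(y) y^T]) W
        W = W + mu * (identity - (phi @ y.T) / n_samples) @ W
    return W

def source_contributions(W, x):
    """Return c with c[i, j, t] = contribution of source j at microphone i."""
    y = W @ x
    A_hat = np.linalg.inv(W)  # stands in for the compensation matrix
    return A_hat[:, :, None] * y[None, :, :]

# Synthetic demo data (assumed, not taken from the paper).
rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 5000))          # two super-Gaussian sources
A = np.array([[1.0, 0.6], [0.5, 1.0]])   # hypothetical mixing matrix
x = A @ s                                # microphone signals

W = natural_gradient_infomax(x)
contrib = source_contributions(W, x)
```

Summing `contrib` over the source axis reproduces the microphone signals, and rescaling any row of `W` leaves `contrib` unchanged. That invariance is the instantaneous counterpart of turning the arbitrary filtering ambiguity into a fixed one: each recovered source is pinned to what the microphones actually observed.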

Bibliographic reference.  Liang, Yanxue / Hagiwara, Ichiro (2006): "Multistage convolutive blind source separation for speech mixture", In INTERSPEECH-2006, paper 1369-Thu2FoP.2.