Many spectrum estimation methods and speech enhancement algorithms have previously been evaluated for noise-robust speaker identification (SID). However, these techniques have mostly been evaluated over artificially noised, mismatched training tasks with GMM-UBM speaker models. It is therefore unclear whether performance improvements observed with these methods translate to a broader range of noisy SID tasks. This study compares selected spectrum estimation methods from three classes: cochlear filterbanks, alternative time-domain windowing, and linear predictionbased techniques, as well as a set of frequency-domain noise reduction algorithms, across a suite of 8 evaluation tasks. The evaluation tasks are designed to expand upon the limited tasks addressed in past evaluations by exploring three research questions: performance on real noise versus artificial noise, performance on matched training tasks versus mismatched tasks, and performance when paired with an i-vector backend versus a GMM-UBM backend. We find that noise-robust spectrum estimation methods can improve the performance of SID systems over the range of noise tasks evaluated, including real noisy tasks, matched training tasks, and i-vector backends. However, performance on the typical GMM-UBM mismatched artificially noised case did not predict performance on other tasks. Finally, the matched enrollment case is a significantly different problem than the mismatched enrollment case.
Bibliographic reference. Godin, Keith W. / Sadjadi, Seyed Omid / Hansen, John H. L. (2013): "Impact of noise reduction and spectrum estimation on noise robust speaker identification", In INTERSPEECH-2013, 3656-3660.