13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Confidence for Speaker Diarization using PCA Spectral Ratio

Orith Toledo-Ronen, Hagai Aronowitz

IBM Research – Haifa, Haifa University Mount Carmel, Haifa, Israel

Confidence scoring is an important component in speaker diarization systems, both for offline speech analytics and for online diarization, which is required to produce the speaker segmentation from very little audio. This paper proposes a confidence measure for speaker diarization based on the spectral ratio of the eigenvalues of the Principal Component Analysis (PCA) transformation, computed on the pre-segmented audio before diarization is performed on the conversation. We tested our method on two-speaker data, and our results show the effectiveness of the PCA spectral ratio confidence measure for both offline and online diarization. We compare our proposed confidence measure with other clustering validation methods, which provide a quantitative measure of the segmentation quality but are calculated on the segmented data after diarization is performed, and with a related approach that extracts a confidence from the PCA of the pre-segmented audio.
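As a rough illustration of the idea of a PCA spectral ratio, the sketch below computes the ratio of the largest to the second-largest PCA eigenvalue over a matrix of per-segment feature vectors. The abstract does not give the exact definition, so this choice of ratio, the function name pca_spectral_ratio, the feature layout (one row per pre-segmented audio chunk), and the synthetic data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def pca_spectral_ratio(segment_features):
    """Ratio of the largest to the second-largest PCA eigenvalue.

    segment_features: array-like of shape (n_segments, n_dims), one row
    per pre-segmented audio chunk (hypothetical feature layout).
    The definition of the ratio is an assumption, not taken from the paper.
    """
    x = np.asarray(segment_features, dtype=float)
    if x.ndim != 2 or x.shape[0] < 3 or x.shape[1] < 2:
        raise ValueError("need at least 3 segments and 2 feature dimensions")

    # Center the data so the SVD corresponds to PCA.
    centered = x - x.mean(axis=0)

    # Singular values come back in descending order; the PCA eigenvalues
    # are s**2 / (n_segments - 1).
    s = np.linalg.svd(centered, compute_uv=False)
    eigvals = s**2 / (x.shape[0] - 1)

    # Guard against division by zero when the second eigenvalue vanishes.
    return float(eigvals[0] / max(eigvals[1], np.finfo(float).tiny))


# Synthetic stand-in data (assumption): one cloud of segments versus two
# clouds separated along a single direction, mimicking two speakers.
rng = np.random.default_rng(0)
one_speaker = rng.normal(size=(200, 10))
two_speakers = one_speaker.copy()
two_speakers[:100, 0] += 4.0
two_speakers[100:, 0] -= 4.0

print("single cloud ratio:", pca_spectral_ratio(one_speaker))
print("two clouds ratio:  ", pca_spectral_ratio(two_speakers))
```

Under these assumptions, a larger ratio indicates that a single direction dominates the variance of the segments, which is what one would expect when two well-separated speakers are present. Because the value is computed from the pre-segmented audio alone, it is available before any clustering is run.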

Index Terms: speaker diarization, principal component analysis, confidence measure

Bibliographic reference.  Toledo-Ronen, Orith / Aronowitz, Hagai (2012): "Confidence for speaker diarization using PCA spectral ratio", In INTERSPEECH-2012, 2162-2165.