13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

KNNDIST: A Non-Parametric Distance Measure for Speaker Segmentation

Seyed Hamidreza Mohammadi (1), Hossein Sameti (2), Mahsa Sadat Elyasi Langarani (2), Amirhossein Tavanaei (2)

(1) Center for Spoken Language Understanding, Oregon Health & Science University, Portland, OR, USA
(2) Speech Processing Laboratory, Sharif University of Technology, Tehran, Iran

A novel distance measure for distance-based speaker segmentation is proposed. The measure is non-parametric, in contrast to the distance measures commonly used in speaker segmentation systems, which often assume a Gaussian distribution when measuring the distance between two audio segments. It is essentially a k-nearest-neighbor distance measure. Removal of non-vowel segments in the pre-processing stage is also proposed. Speaker segmentation performance is tested on artificially created conversations from the TIMIT database and on two AMI conversations. For short window lengths, the Missed Detection Rate decreases significantly. For moderate window lengths, both the Missed Detection and False Alarm Rates decrease. The computational cost of the distance measure is high for long window lengths.
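
The abstract does not give the formula for the measure, so the following is only a minimal sketch of one way a k-nearest-neighbor distance between two audio segments could be computed. It assumes each segment is a list of equal-length feature vectors (frames), uses Euclidean distance, and subtracts a within-segment term from a cross-segment term. The function names, the default k=3, and this particular normalization are illustrative assumptions, not the authors' definition of KNNDIST.

```python
import math


def mean_knn_distance(queries, refs, k, skip_same_index=False):
    """Average distance from each query frame to its k nearest frames in refs.

    skip_same_index=True is for comparing a segment with itself, so that a
    frame is not counted as its own nearest neighbor.
    """
    total = 0.0
    for i, q in enumerate(queries):
        dists = sorted(
            math.dist(q, r)
            for j, r in enumerate(refs)
            if not (skip_same_index and i == j)
        )
        total += sum(dists[:k]) / k
    return total / len(queries)


def knn_segment_distance(seg_a, seg_b, k=3):
    """Hypothetical symmetric k-NN distance between two segments.

    Assumption: cross-segment neighbor distances are compared against
    within-segment neighbor distances; no Gaussian model is fitted.
    """
    if k >= min(len(seg_a), len(seg_b)):
        raise ValueError("k must be smaller than the number of frames in each segment")
    cross = 0.5 * (
        mean_knn_distance(seg_a, seg_b, k) + mean_knn_distance(seg_b, seg_a, k)
    )
    within = 0.5 * (
        mean_knn_distance(seg_a, seg_a, k, skip_same_index=True)
        + mean_knn_distance(seg_b, seg_b, k, skip_same_index=True)
    )
    return cross - within
```

In this sketch the brute-force neighbor search compares every frame with every other frame, so the work grows with the product of the two segment lengths. That is in line with the abstract's remark that the cost is high for long windows.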

Index Terms: speaker segmentation, distance measure, k-nearest-neighbor


Bibliographic reference. Mohammadi, Seyed Hamidreza / Sameti, Hossein / Langarani, Mahsa Sadat Elyasi / Tavanaei, Amirhossein (2012): "KNNDIST: a non-parametric distance measure for speaker segmentation", In INTERSPEECH-2012, 2282-2285.