INTERSPEECH 2006 - ICSLP
Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Distance Measure Between Gaussian Distributions for Discriminating Speaking Styles

Goshu Nagino, Makoto Shozakai

Asahi Kasei Corporation, Japan

Discriminating speaking styles is an important issue in speech recognition, speaker recognition and speaker segmentation. This paper compares distance measures between Gaussian distributions for discriminating speaking styles. The Mahalanobis distance, the Bhattacharyya distance and the Kullback-Leibler divergence, which are commonly used as distance measures between Gaussian distributions, are evaluated in terms of their accuracy in discriminating speaking styles. Accuracy is judged on a visualized map, in which speaking-style speech corpora are mapped onto a two-dimensional space using multidimensional scaling. Speaking-style clusters appear clearly grouped on the visualized maps obtained with the Bhattacharyya distance and the Kullback-Leibler divergence. In addition, the visualized map corresponds to speech recognition performance, and the Kullback-Leibler divergence shows higher sensitivity to recognition performance.
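As an illustration of the three measures named in the abstract, the following is a minimal sketch using their standard textbook forms for Gaussians with diagonal covariance. The abstract does not give the exact definitions used in the paper, so several choices here are assumptions: the diagonal-covariance restriction, the use of the averaged variance in the Mahalanobis distance, the symmetrized form of the Kullback-Leibler divergence, and all function names.

```python
import math

# Each Gaussian is given as (mean vector, variance vector); diagonal
# covariance is an assumption made here to keep the sketch dependency-free.

def mahalanobis(mu1, var1, mu2, var2):
    # Assumption: the two variances are averaged to form a shared covariance.
    return math.sqrt(sum((a - b) ** 2 / ((v + w) / 2.0)
                         for a, v, b, w in zip(mu1, var1, mu2, var2)))

def bhattacharyya(mu1, var1, mu2, var2):
    total = 0.0
    for a, v, b, w in zip(mu1, var1, mu2, var2):
        avg = (v + w) / 2.0
        # Mean-difference term plus a term that penalizes variance mismatch.
        total += (a - b) ** 2 / (8.0 * avg) + 0.5 * math.log(avg / math.sqrt(v * w))
    return total

def kl_divergence(mu1, var1, mu2, var2):
    # KL(N1 || N2); not symmetric in its arguments.
    return sum(0.5 * (v / w + (b - a) ** 2 / w - 1.0 + math.log(w / v))
               for a, v, b, w in zip(mu1, var1, mu2, var2))

def symmetric_kl(mu1, var1, mu2, var2):
    # Assumption: symmetrize by summing both directions so it can serve
    # as a dissimilarity for multidimensional scaling.
    return (kl_divergence(mu1, var1, mu2, var2)
            + kl_divergence(mu2, var2, mu1, var1))
```

Under these definitions, two Gaussians with equal means but different variances have a Mahalanobis distance of zero, while the Bhattacharyya distance and the Kullback-Leibler divergence remain positive. A pairwise matrix of such values over a set of corpora is the kind of input a multidimensional scaling method takes.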

Bibliographic reference.  Nagino, Goshu / Shozakai, Makoto (2006): "Distance measure between Gaussian distributions for discriminating speaking styles", In INTERSPEECH-2006, paper 1383-Mon3CaP.6.