To investigate how individual acoustic features influence human speaker identification, the authors propose a new model that predicts the perceptual contribution rate of each acoustic feature. The perceptual contribution rate is measured in a listening test using speech resynthesized by swapping the acoustic features of natural speech. Its predicted value is calculated by a model based on differences in the acoustic features: a cepstral distance, representing spectral information, and the difference in mean logarithmic voice fundamental frequency, representing fundamental frequency information. After the weighting factors of the acoustic features are optimized, the contribution rates can be predicted with an RMS error of 7.71%.
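The abstract does not give the functional form of the prediction model, so the following Python sketch is only one plausible reading, not the authors' method. It assumes the predicted contribution rate of the spectral feature is its weighted cepstral distance divided by the sum of the weighted cepstral distance and the weighted difference in mean log fundamental frequency, and it assumes a simple grid search for the weight optimization. All function names, parameters, and the normalization are hypothetical.

```python
import math


def predicted_contribution(cep_dist, f0_diff, w_cep, w_f0):
    """Predicted share (0..1) attributed to the spectral feature.

    Assumed form (not taken from the paper): weighted cepstral distance
    normalized by the sum of both weighted feature differences.
    """
    spectral = w_cep * cep_dist
    pitch = w_f0 * abs(f0_diff)
    total = spectral + pitch
    if total == 0.0:
        return 0.5  # no acoustic difference: split evenly (arbitrary choice)
    return spectral / total


def rms_error(predicted, measured):
    """Root-mean-square error between predicted and measured rates."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


def fit_weight_ratio(pairs, measured, candidates):
    """Grid search over w_f0 with w_cep fixed to 1.0.

    pairs: list of (cepstral distance, mean log-F0 difference) per speaker pair.
    measured: listening-test contribution rates for the same speaker pairs.
    Returns (best w_f0, RMS error at that weight).
    """
    best = None
    for w_f0 in candidates:
        preds = [predicted_contribution(c, f, 1.0, w_f0) for c, f in pairs]
        err = rms_error(preds, measured)
        if best is None or err < best[1]:
            best = (w_f0, err)
    return best
```

Only the ratio of the two weights matters under this normalization, which is why the sketch fixes one weight and searches over the other. Multiplying the returned RMS error by 100 expresses it in percentage points, the unit in which the abstract reports its error.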
Bibliographic reference. Higuchi, Norio / Hashimoto, Makoto (1995): "Analysis of acoustic features affecting speaker identification", In EUROSPEECH-1995, 435-438.