First International Conference on Spoken Language Processing (ICSLP 90)

Kobe, Japan
November 18-22, 1990

Speaker Weighted Training of HMM Using Multiple Reference Speakers

Hiroaki Hattori, Satoshi Nakamura, Kiyohiro Shikano, Shigeki Sagayama

ATR Interpreting Telephony Research Laboratories, Kyoto, Japan

This paper proposes a new speaker adaptation method that uses speaker weights in multiple reference speaker training. The speaker weights are calculated to reflect the similarity of each reference speaker's dynamic features to those of an input speaker, and they are used so that these similarities influence the hidden Markov models (HMMs). Evaluation experiments were carried out on a /b,d,g,m,n,N/ phoneme recognition task using 8 speakers. Average recognition rates were 68.0%, 66.4%, and 65.6% for three test sets with different speech styles: word utterances, phrase-by-phrase utterances, and continuous utterances, respectively. These rates are 1.6%, 6.7%, and 8.2% higher, respectively, than the supplemented HMM rates.
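
The abstract does not give the formulas for the speaker weights or for how they enter HMM training. As a rough illustration of the general idea only, the sketch below assumes that each reference speaker has a scalar distance to the input speaker, that weights come from a softmax over negative distances, and that a Gaussian mean is pooled from per-speaker statistics. The function names, the `temperature` parameter, and the softmax form are all hypothetical and are not taken from the paper.

```python
import math

def speaker_weights(distances, temperature=1.0):
    """Turn per-reference-speaker distances into weights that sum to 1.

    Assumption: a smaller distance means a more similar speaker, so it
    receives a larger weight. The softmax form is illustrative only.
    """
    scores = [-d / temperature for d in distances]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_mean(speaker_means, speaker_counts, weights):
    """Pool per-speaker mean vectors, scaling each speaker's count by its weight."""
    dim = len(speaker_means[0])
    numerator = [0.0] * dim
    denominator = 0.0
    for mean, count, w in zip(speaker_means, speaker_counts, weights):
        for i in range(dim):
            numerator[i] += w * count * mean[i]
        denominator += w * count
    return [x / denominator for x in numerator]
```

With this sketch, a reference speaker that is closer to the input speaker contributes more to the pooled model parameter, which is one simple way for the similarities to influence the trained HMM.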

Bibliographic reference. Hattori, Hiroaki / Nakamura, Satoshi / Shikano, Kiyohiro / Sagayama, Shigeki (1990): "Speaker weighted training of HMM using multiple reference speakers", in ICSLP-1990, 149-152.