Second European Conference on Speech Communication and Technology

Genova, Italy
September 24-26, 1991


Speaker Independent Word Recognition using HMMs with an Orthogonalized Phonetic Segment Codebook

Tsuneo Nitta, Jun'ichi Iwasaki, Hiroshi Matsu'ura

Information Systems Engineering Laboratory, TOSHIBA Corp., Saiwai-ku, Kawasaki, Japan

The large distortion of matrix quantization (MQ) becomes a problem because the spectrum-time patterns in MQ have many dimensions and wide variation. In this paper, we introduce a multiple phonological unit called the phonetic segment as the unit of MQ and apply statistical matrix quantization (SMQ). SMQ effectively incorporates the pattern variations of each phonetic segment into an orthogonalized phonetic segment codebook. We also propose a simple SMQ-HMM training algorithm called Equally Counted K-best Learning, in which each phonetic event observed within the best K is counted equally in a model and output probabilities are smoothed without a fuzzy rule. The proposed method was tested on a 100-word vocabulary data set uttered by 10 unknown speakers, using a real-time recognition system, and achieved a high performance of 96.0%.
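The abstract does not spell out the details of Equally Counted K-best Learning, so the following is only a minimal sketch of the general idea as stated: for every frame, the K best-matching codebook entries each add the same count to the output distribution of the state that frame belongs to, and the counts are then normalized. The function name `k_best_equal_counts`, the use of similarity scores where higher is better, and the fixed frame-to-state alignment are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def k_best_equal_counts(scores, states, n_states, k):
    """Estimate discrete HMM output probabilities with equal K-best counting.

    scores   : (T, C) array, similarity of each frame to each codeword
               (assumption: higher means a better match).
    states   : length-T sequence giving the state index of each frame
               (assumption: the alignment is already known).
    n_states : number of HMM states.
    k        : how many best-matching codewords to count per frame.
    Returns an (n_states, C) array of output probabilities.
    """
    scores = np.asarray(scores, dtype=float)
    n_codes = scores.shape[1]
    counts = np.zeros((n_states, n_codes))
    for t, s in enumerate(states):
        # Indices of the K best-scoring codewords for this frame.
        best = np.argsort(-scores[t], kind="stable")[:k]
        # Each of the K codewords receives the same count, with no
        # membership weighting of the kind used in fuzzy quantization.
        counts[s, best] += 1.0
    totals = counts.sum(axis=1, keepdims=True)
    # States with no frames keep an all-zero row instead of dividing by zero.
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

# Toy example: 3 frames, 3 codewords, 2 states, K = 2.
probs = k_best_equal_counts(
    [[0.9, 0.5, 0.1],
     [0.2, 0.8, 0.7],
     [0.1, 0.3, 0.6]],
    states=[0, 0, 1],
    n_states=2,
    k=2,
)
print(probs)
```

With K greater than 1, codewords that were close to the best match still receive probability mass, which gives a simple form of smoothing compared with counting only the single best label.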


Bibliographic reference.  Nitta, Tsuneo / Iwasaki, Jun'ichi / Matsu'ura, Hiroshi (1991): "Speaker independent word recognition using HMMs with an orthogonalized phonetic segment codebook", In EUROSPEECH-1991, 1107-1110.