Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Spontaneous Speech Recognition Using Dynamic Cepstra Incorporating Forward and Backward Masking Effect

Tomohiko Beppu (1), Kiyoaki Aikawa (2)

(1) ATR Interpreting Telecommunications Research Labs., Soraku-gun, Kyoto, Japan
(2) ATR Human Information Proc. Research Labs., Soraku-gun, Kyoto, Japan

Spectral parameters for spontaneous speech recognition need to emphasize the relevant spectral dynamics of speech sounds more clearly. This paper discusses the dynamic cepstrum, a spectral representation that simulates the time-frequency characteristics of auditory forward masking. We propose a new dynamic cepstrum that incorporates both forward and backward masking to improve robustness against noise and mismatched utterance styles. The dynamic characteristics of the signal at a given time point are extracted efficiently using information from both sides of that point on the time axis. For the same masking duration, the new dynamic cepstrum expresses the dynamic features of speech more efficiently than a dynamic cepstrum incorporating only one direction of masking. Using a phonetic typewriter, the new dynamic cepstrum reduced the phoneme recognition error by more than 10% for both of two utterance styles: (1) the same as and (2) different from the utterance style of the training database.
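
The idea of using frames on both sides of a time point can be illustrated with a minimal sketch. This is not the formulation used in the paper, which the abstract does not give. It assumes that masking is modelled by subtracting a decaying weighted sum of neighbouring cepstral frames from the current frame: preceding frames for forward masking, following frames for backward masking. The function names, the geometric weights, and the `peak` and `decay` values are all hypothetical.

```python
from typing import List

Frame = List[float]  # one cepstral vector (one value per cepstral coefficient)


def masking_weights(duration: int, peak: float = 0.3, decay: float = 0.7) -> List[float]:
    """Hypothetical masking weights for lags 1..duration: peak * decay**(lag - 1).

    The geometric decay is an assumption made for illustration only.
    """
    return [peak * decay ** (lag - 1) for lag in range(1, duration + 1)]


def dynamic_cepstrum(cepstra: List[Frame], forward: int, backward: int) -> List[Frame]:
    """Subtract an assumed masking pattern from each cepstral frame.

    forward:  number of preceding frames that mask the current frame
    backward: number of following frames that mask the current frame
    Neighbours that fall outside the utterance are simply skipped.
    """
    fwd_w = masking_weights(forward)
    bwd_w = masking_weights(backward)
    n_frames = len(cepstra)
    out: List[Frame] = []
    for t, frame in enumerate(cepstra):
        masked = list(frame)  # copy so the input is not modified
        # Forward masking: earlier frames suppress the current frame.
        for lag, w in enumerate(fwd_w, start=1):
            if t - lag >= 0:
                for k in range(len(masked)):
                    masked[k] -= w * cepstra[t - lag][k]
        # Backward masking: later frames suppress the current frame.
        for lag, w in enumerate(bwd_w, start=1):
            if t + lag < n_frames:
                for k in range(len(masked)):
                    masked[k] -= w * cepstra[t + lag][k]
        out.append(masked)
    return out


# A one-coefficient "step" signal: silence followed by a steady sound.
step = [[0.0], [0.0], [0.0], [1.0], [1.0], [1.0]]
one_sided = dynamic_cepstrum(step, forward=2, backward=0)
two_sided = dynamic_cepstrum(step, forward=1, backward=1)
```

In this toy example both calls use a total masking duration of two frames. The one-sided version leaves the frames before the onset unchanged, while the two-sided version also alters the frame just before the onset, so the transition is marked on both sides of the change.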


Bibliographic reference. Beppu, Tomohiko / Aikawa, Kiyoaki (1995): "Spontaneous speech recognition using dynamic cepstra incorporating forward and backward masking effect", in EUROSPEECH-1995, 511-514.