The spectral parameters used for spontaneous speech recognition need to emphasize the relevant spectral dynamics of speech sounds more clearly. This paper discusses the dynamic cepstrum, a spectral representation that simulates the time-frequency characteristics of auditory forward masking. We propose a new dynamic cepstrum that incorporates both forward and backward masking to improve robustness against noise and mismatched utterance styles. The dynamic characteristics of the signal at a given time point are extracted efficiently using information from both sides of that point on the time axis. The new dynamic cepstrum expresses the dynamic features of speech more efficiently than a dynamic cepstrum that incorporates only one direction of masking with the same masking duration. In experiments with a phonetic typewriter, the new dynamic cepstrum reduced the phoneme recognition error by more than 10% for two utterance styles: (1) the same as and (2) different from the utterance style of the training database.
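The idea of masking from both sides of the time axis can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes the masking pattern is a weighted sum of neighbouring cepstral frames with exponentially decaying weights, and it omits any quefrency-dependent smoothing. The function name `dynamic_cepstrum` and the parameters `n_mask`, `alpha` and `decay` are hypothetical and chosen only for illustration.

```python
def dynamic_cepstrum(frames, n_mask=3, alpha=0.3, decay=0.5, backward=True):
    """Subtract a weighted sum of neighbouring cepstral frames from each frame.

    frames: list of cepstral vectors (lists of floats), one per time frame.
    n_mask, alpha, decay: hypothetical masking length and weights
    (assumptions for this sketch, not values from the paper).
    backward: if False, only past frames mask the current frame.
    """
    # Weight applied to the frame i steps away; decays with distance.
    weights = [alpha * decay ** (i - 1) for i in range(1, n_mask + 1)]
    n_frames = len(frames)
    out = []
    for t in range(n_frames):
        masked = list(frames[t])
        for i, w in enumerate(weights, start=1):
            neighbours = [t - i]              # forward masking: past frames
            if backward:
                neighbours.append(t + i)      # backward masking: future frames
            for u in neighbours:
                if 0 <= u < n_frames:         # skip frames outside the utterance
                    for k in range(len(masked)):
                        masked[k] -= w * frames[u][k]
        out.append(masked)
    return out


if __name__ == "__main__":
    # A steady one-dimensional "cepstrum": the interior frame is fully masked.
    steady = [[1.0], [1.0], [1.0]]
    print(dynamic_cepstrum(steady, n_mask=1, alpha=0.5, decay=1.0))
    print(dynamic_cepstrum(steady, n_mask=1, alpha=0.5, decay=1.0, backward=False))
```

In this toy setting, stationary stretches are suppressed while frames that differ from their neighbours keep more of their value, which is the sense in which masking emphasizes spectral dynamics. With `backward=False` the same code reduces to one-directional (forward-only) masking.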
Bibliographic reference. Beppu, Tomohiko / Aikawa, Kiyoaki (1995): "Spontaneous speech recognition using dynamic CEPSTRA incorporating forward and backward masking effect", In EUROSPEECH-1995, 511-514.