Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
In this paper, a number of alternative pre-processing configurations are applied to an HMM-based phoneme recognition system and evaluated on the TIMIT speech corpus. It is demonstrated that adding processing steps after the initial signal processing offers a considerable advantage. F-ratio analysis gives a clear ranking of the discriminatory power of commonly used features such as log-power, zero-crossing rate, cepstral, delta cepstral and band-power coefficients. Applying a linear discriminant analysis transformation to reduce a 43-variable feature set to a 10-variable linearly transformed feature set yields a 20% reduction in the misclassification rate. Finally, the paper demonstrates that vector quantisation using totally non-parametric classification trees can lead to phoneme classification results competitive with those achieved using traditional techniques, while at the same time offering much faster evaluation.
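To illustrate the kind of feature ranking that F-ratio analysis produces, the following is a minimal sketch in Python. It assumes one common definition of the F-ratio (variance of the class means divided by the mean within-class variance); the abstract does not state which formulation the paper uses. The function names `f_ratio` and `rank_features`, the phoneme labels and all numeric values are hypothetical and are not taken from the paper or from TIMIT.

```python
from statistics import mean, pvariance


def f_ratio(values_by_class):
    """F-ratio of one feature, under the assumed definition:
    variance of the class means / mean of the within-class variances."""
    class_means = [mean(v) for v in values_by_class.values()]
    between = pvariance(class_means)
    within = mean([pvariance(v) for v in values_by_class.values()])
    return between / within


def rank_features(data):
    """data maps feature name -> {class label -> list of values}.
    Returns feature names sorted by descending F-ratio, plus the scores."""
    scores = {name: f_ratio(by_class) for name, by_class in data.items()}
    order = sorted(scores, key=scores.get, reverse=True)
    return order, scores


# Toy, made-up values for two features over two phoneme classes.
toy = {
    "log_power": {"aa": [1.0, 1.2, 0.8], "s": [3.0, 3.2, 2.8]},
    "zero_crossing": {"aa": [1.0, 2.0, 3.0], "s": [2.0, 3.0, 4.0]},
}

order, scores = rank_features(toy)
for name in order:
    print(f"{name}: F-ratio = {scores[name]:.3f}")
```

A feature whose class means are well separated relative to its spread within each class receives a high F-ratio, so in this toy data `log_power` ranks above `zero_crossing`.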
Bibliographic reference. Tridgell, Andrew / Millar, Bruce / Do, Kim-Anh (1992): "Alternative preprocessing techniques for discrete hidden Markov model phoneme recognition", In ICSLP-1992, 631-634.