Second International Conference on Spoken Language Processing (ICSLP'92)

Banff, Alberta, Canada
October 13-16, 1992

Alternative Preprocessing Techniques for Discrete Hidden Markov Model Phoneme Recognition

Andrew Tridgell (1), Bruce Millar (1), Kim-Anh Do (2)

(1) Computer Sciences Laboratory, Research School of Physical Sciences and Engineering; (2) Statistical Sciences Division, Centre for Mathematics and Its Applications; Australian National University, Canberra, A.C.T., Australia

In this paper, a number of alternative preprocessing configurations are applied to an HMM-based phoneme recognition system and evaluated on the TIMIT speech corpus. Adding processing steps after the initial signal processing is shown to give a considerable advantage. F-ratio analysis gives a clear ranking of the discriminatory power of commonly used features such as log-power, zero-crossing rate, cepstral, delta cepstral and band-power coefficients. A linear discriminant analysis transformation from a 43-variable feature set to a 10-variable linearly transformed feature set yields a 20% reduction in the misclassification rate. Finally, the paper demonstrates that vector quantisation using totally non-parametric classification trees can lead to phoneme classification results competitive with those achieved using traditional techniques, while at the same time offering much faster evaluation.
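As an illustration of the kind of F-ratio ranking the abstract describes, the following minimal sketch scores each feature by the variance of its per-class means divided by its average within-class variance. This is one common definition of the F-ratio and is assumed here; the abstract does not state the exact formula the authors used. The function names (`f_ratio`, `rank_features`), the feature names and the toy data are all hypothetical and are not taken from the paper.

```python
from statistics import fmean, pvariance


def f_ratio(values, labels):
    """F-ratio of one feature (assumed definition): variance of the
    class means divided by the mean of the within-class variances."""
    by_class = {}
    for value, label in zip(values, labels):
        by_class.setdefault(label, []).append(value)
    class_means = [fmean(vs) for vs in by_class.values()]
    between = pvariance(class_means)
    within = fmean([pvariance(vs) for vs in by_class.values()])
    # A feature with zero within-class spread separates the classes perfectly.
    return between / within if within > 0 else float("inf")


def rank_features(rows, labels, names):
    """Return (name, F-ratio) pairs sorted from most to least discriminative."""
    columns = list(zip(*rows))  # one tuple of values per feature
    scores = [(name, f_ratio(col, labels)) for name, col in zip(names, columns)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data (hypothetical): two frames per phoneme class, two features per frame.
rows = [(1.0, 5.0), (1.2, 3.0), (3.0, 4.0), (3.2, 6.0)]
labels = ["a", "a", "b", "b"]
names = ["log_power", "zero_crossing"]

ranking = rank_features(rows, labels, names)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy data the first feature separates the two classes far more cleanly than the second, so it receives the higher score. A ranking of this kind can be used to compare candidate features before any transformation is applied.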

Bibliographic reference. Tridgell, Andrew / Millar, Bruce / Do, Kim-Anh (1992): "Alternative preprocessing techniques for discrete hidden Markov model phoneme recognition", in ICSLP-1992, 631-634.