5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Cepstral-Time Matrices and LDA for Improved Connected Digit and Sub-Word Recognition Accuracy

Ben Milner

Speech Technology Unit, BT Laboratories, Martlesham Heath, Suffolk, UK

Previous work has shown that, for isolated word recognition, good accuracy improvements can be obtained by using cepstral-time matrices as the speech feature instead of the more conventional MFCC-based feature augmented with higher order cepstrum. This work extends those improvements to UK English connected digit strings and to a sub-word based town names task. Experimental results are presented for a range of cepstral-time matrix widths, from a stack width of 3 up to 13 MFCC frames. In addition, various selections of columns from the cepstral-time matrix are used as the final speech feature. Tests show that the optimal implementation of the cepstral-time matrix varies according to the specific recognition task. Finally, linear discriminant analysis (LDA) is applied to cepstral-time matrices and is shown to improve recognition performance while also reducing the size of the final speech feature. Three different implementations of LDA are described and demonstrated on isolated digit and sub-word tasks.
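As an illustration of the feature described in the abstract, the following minimal sketch builds a cepstral-time matrix from a stack of MFCC frames and keeps a chosen subset of columns. It assumes, and the abstract does not state, that the matrix is formed by applying an orthonormal DCT-II along the time axis of the stacked frames. The function names (`dct_basis`, `cepstral_time_matrix`, `features`) and the parameters `width` and `columns` are hypothetical, chosen only for this example.

```python
import math

def dct_basis(width, k):
    """Basis vector k of an orthonormal DCT-II over `width` time steps."""
    scale = math.sqrt(1.0 / width) if k == 0 else math.sqrt(2.0 / width)
    return [scale * math.cos(math.pi * (2 * t + 1) * k / (2 * width))
            for t in range(width)]

def cepstral_time_matrix(frames, columns):
    """Transform a stack of MFCC frames into a cepstral-time matrix.

    frames:  list of `width` MFCC vectors (the stack), each a list of floats.
    columns: temporal DCT indices to keep (assumed selection scheme).
    Returns one row per cepstral coefficient, one column per kept index.
    """
    width = len(frames)
    n_ceps = len(frames[0])
    matrix = []
    for c in range(n_ceps):
        # Time trajectory of cepstral coefficient c across the stack.
        trajectory = [frames[t][c] for t in range(width)]
        row = [sum(b * x for b, x in zip(dct_basis(width, k), trajectory))
               for k in columns]
        matrix.append(row)
    return matrix

def features(mfcc, width, columns):
    """Slide a stack of `width` frames over an utterance, one frame at a time,
    and flatten each cepstral-time matrix into a feature vector."""
    out = []
    for start in range(len(mfcc) - width + 1):
        m = cepstral_time_matrix(mfcc[start:start + width], columns)
        out.append([v for row in m for v in row])
    return out
```

With a stack width of 3 and `columns=[0, 1]`, each feature vector has twice as many elements as there are cepstral coefficients per frame. Under this assumed formulation, column 0 is a scaled average of the stack, and higher columns capture progressively faster temporal variation.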


Bibliographic reference. Milner, Ben (1997): "Cepstral-time matrices and LDA for improved connected digit and sub-word recognition accuracy", in EUROSPEECH-1997, 405-408.