4th International Conference on Spoken Language Processing

Philadelphia, PA, USA
October 3-6, 1996

Statistical Dialect Classification Based on Mean Phonetic Features

David R. Miller, James Trischitta

BBN Hark Systems, Cambridge, MA, USA

This paper describes a text-dependent method for automatic utterance classification and dialect model selection that uses mean cepstral and duration features computed per phoneme. From transcribed dialect data, we build a linear discriminant that separates the dialects in feature space. This method is potentially much faster than our previous selection algorithm. We achieve error rates of 8% when distinguishing Northern US speakers from Southern US speakers, and average error rates of 13% across a variety of finer pairwise dialect discriminations. We also describe the training and test corpora collected for this work.
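
As an illustration of the general idea of separating two dialects with a linear discriminant in feature space, the following is a minimal sketch of a two-class Fisher linear discriminant in Python with NumPy. It is not the authors' implementation: the abstract does not specify the discriminant's exact form, so the Fisher criterion, the feature layout (one row per utterance, columns standing in for per-phoneme mean cepstral and duration values), the ridge term, the midpoint threshold, the function names, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def fit_fisher_discriminant(x_a, x_b, ridge=1e-6):
    """Fit a two-class Fisher linear discriminant (illustrative sketch).

    x_a, x_b: arrays of shape (n_utterances, n_features), one row per
    utterance. The columns are hypothetical stand-ins for per-phoneme
    mean cepstral and duration features.
    Returns (w, threshold); a score above the threshold means class A.
    """
    mean_a = x_a.mean(axis=0)
    mean_b = x_b.mean(axis=0)
    # Pooled within-class scatter matrix.
    centered = np.vstack([x_a - mean_a, x_b - mean_b])
    scatter = centered.T @ centered
    # Small ridge term (an assumption) keeps the solve well conditioned.
    scatter = scatter + ridge * np.eye(scatter.shape[0])
    # Fisher direction: scatter^{-1} (mean_a - mean_b).
    w = np.linalg.solve(scatter, mean_a - mean_b)
    # Threshold halfway between the two projected class means.
    threshold = 0.5 * (w @ mean_a + w @ mean_b)
    return w, threshold


def classify(x, w, threshold):
    """Return 0 for class A and 1 for class B for each row of x."""
    return np.where(x @ w > threshold, 0, 1)


# Synthetic stand-in data; real features would come from transcribed speech.
rng = np.random.default_rng(0)
dialect_a = rng.normal(loc=1.0, size=(200, 4))
dialect_b = rng.normal(loc=-1.0, size=(200, 4))
w, threshold = fit_fisher_discriminant(dialect_a, dialect_b)
errors = np.sum(classify(dialect_a, w, threshold) != 0)
errors += np.sum(classify(dialect_b, w, threshold) != 1)
print("training error rate:", errors / 400)
```

Once the discriminant is fitted, classifying an utterance costs a single dot product and a comparison, which is consistent with the abstract's remark that this kind of method is potentially much faster than the earlier selection algorithm.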

Bibliographic reference. Miller, David R. / Trischitta, James (1996): "Statistical dialect classification based on mean phonetic features", in ICSLP-1996, 2025-2027.