Second European Conference on Speech Communication and Technology

Genova, Italy
September 24-26, 1991


Influence of Field Data in HMM Training for a Vocal Server

Dominique Morin

Prosodie Informatique, Paris, France; CNET, Speech Communication Department, LAA/TSS/RCP, Lannion, France

Our task is to improve speech recognition in large-scale, general-public applications over the telephone network, known as Voice Response Systems (VRS). It has been observed that the performance of speech recognition systems decreases once they are put into service, compared to the rates obtained with laboratory data. This paper studies the contribution of field data (i.e. data extracted from the VRS in operation) to the training of a speaker-independent isolated-word speech recognition system. The experiments are conducted with a database of about 20,000 tokens, half field and half laboratory data, a vocabulary of 21 words, and HMM modelling with 3 types of modelling units (fixed-length word, variable-length word, and allophone). The results show that models trained on a mixture of field and laboratory data perform 30% better than models trained exclusively on laboratory data.


Bibliographic reference.  Morin, Dominique (1991): "Influence of field data in HMM training for a vocal server", In EUROSPEECH-1991, 735-738.