4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
On-line speaker adaptation is desirable for speech recognition dictation applications because it offers the possibility of improving the system with speaker-specific data obtained from the user. Since the user will work with such a device over a long period, long-term adaptation performance is more important for a dictation system than adaptation speed. In contrast to speaker-dependent re-training, on-line speaker adaptation does not require the speaker-specific speech data to be stored, and each adaptation step does not require a large computational effort. In this paper we describe our way of performing on-line Bayesian speaker adaptation using partial traceback. We compare supervised with unsupervised adaptation, and speaker adaptation with speaker-dependent training on the adaptation material. Compared to the speaker-independent startup models, the error rate was halved after five hours of supervised adaptation in our experiments. In the long-term experiments, supervised on-line adaptation performed similarly to speaker-dependent training on the adaptation material.
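The abstract does not give the update equations, so the following is only a minimal sketch of how a Bayesian (MAP-style) update can run on-line without storing speech data. It assumes a single scalar Gaussian mean with a conjugate prior centered on the speaker-independent value. The class name `OnlineMapMean`, the prior weight `tau`, and the per-observation `weight` are hypothetical illustrations, not the paper's notation or method.

```python
from dataclasses import dataclass


@dataclass
class OnlineMapMean:
    """Running MAP estimate of one Gaussian mean (illustrative assumption)."""

    prior_mean: float    # speaker-independent starting value
    tau: float           # hypothetical prior weight: larger means slower adaptation
    count: float = 0.0   # accumulated observation weight
    total: float = 0.0   # accumulated weighted sum of observations

    def update(self, observation: float, weight: float = 1.0) -> float:
        # Only two accumulators change; the observation itself is discarded.
        self.count += weight
        self.total += weight * observation
        return self.mean

    @property
    def mean(self) -> float:
        # Interpolates between the prior mean and the sample mean of the data.
        return (self.tau * self.prior_mean + self.total) / (self.tau + self.count)
```

With few observations the estimate stays near `prior_mean`. As `count` grows relative to `tau` it approaches the sample mean of the speaker's data, which is the behavior one would expect when long-term adaptation approaches speaker-dependent training.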
Bibliographic reference. Thelen, Eric (1996): "Long term on-line speaker adaptation for large vocabulary dictation", In ICSLP-1996, 2139-2142.