Third International Conference on Spoken Language Processing (ICSLP 94)
This paper presents a method for adapting to variations in telephone line characteristics in speech recognition over telephone lines. For real-world applications, it is important to adapt to several variability factors simultaneously, including speakers, line characteristics, and telephone microphones. Our approach adapts models of high-quality speech to models of speech influenced by telephone line characteristics, using the Vector Field Smoothing (VFS) technique as the adaptation method. This technique is designed for training with a small amount of data. In this paper, the VFS technique is shown to be an effective method for simultaneously adapting to speaker, line, and telephone microphone characteristics. In experiments with a speaker-independent recognizer using telephone-quality speech data collected through an actual telephone line, the VFS technique performed extremely well in the simultaneous adaptation of speaker and line characteristics. Furthermore, we present preliminary experimental results for telephone line characteristic adaptation using an advanced adaptation method that combines Maximum A Posteriori (MAP) estimation with the VFS technique. This method aims to achieve faster adaptation, which will help provide a comfortable human-machine interface in telephone network applications. Encouraging results were obtained in several experiments.
Bibliographic reference. Takahashi, Jun-ichi / Sagayama, Shigeki (1994): "Telephone line characteristic adaptation using vector field smoothing technique", In ICSLP-1994, 991-994.