14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Cross-Lingual Acoustic Model Adaptation Based on Transfer Vector Field Smoothing with MAP

Masahiro Saiko (1), Shigeki Matsuda (1), Ken Hanazawa (2), Ryosuke Isotani (2), Chiori Hori (1)

(1) NICT, Japan
(2) NEC Corporation, Japan

We propose a method that uses data from other languages to adapt acoustic models for robust speech recognition in real environments. In a deployed speech recognition system, acoustic models can be adapted effectively using the speech data logged by the system. When developing a system for a new language, however, this step is impossible because no such logged data exists for that language. Our method assumes that similar Gaussians in different languages have similar transfer vectors, and it estimates the transfer vector of each Gaussian in the new language's acoustic model from the transfer vectors of another language. We evaluated Indonesian acoustic models adapted with transfer vectors estimated from Japanese transfer vectors. The proposed method achieved a relative error reduction of 10.6% on speech data from real environments.
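The core assumption in the abstract, that similar Gaussians have similar transfer vectors, can be illustrated with a minimal sketch. The abstract does not give the paper's smoothing formula or its MAP step, so this is not the authors' algorithm: the inverse-distance weighting over the `k` nearest source Gaussians, the function names, and the toy 2-D means are all assumptions made for illustration. A transfer vector here means the shift applied to a Gaussian mean by adaptation.

```python
import math

def estimate_transfer_vectors(src_means, src_transfers, tgt_means, k=2):
    """Interpolate a transfer vector for each target-language Gaussian mean
    from the k nearest source-language Gaussian means.

    Assumption: inverse-distance weighting stands in for the paper's
    vector field smoothing, whose exact form the abstract does not state.
    """
    estimated = []
    for t in tgt_means:
        # Euclidean distance from this target mean to every source mean.
        dists = sorted((math.dist(t, s), i) for i, s in enumerate(src_means))
        nearest = dists[:k]
        # Closer source Gaussians get larger weights; epsilon avoids 1/0.
        weights = [1.0 / (d + 1e-9) for d, _ in nearest]
        total = sum(weights)
        vec = [
            sum(w * src_transfers[i][j] for w, (_, i) in zip(weights, nearest)) / total
            for j in range(len(t))
        ]
        estimated.append(vec)
    return estimated

def adapt_means(means, transfers):
    """Shift each Gaussian mean by its estimated transfer vector."""
    return [[m + v for m, v in zip(mean, vec)] for mean, vec in zip(means, transfers)]

# Toy data (hypothetical): two source-language Gaussians with known
# transfer vectors, and two target-language Gaussians with none.
src_means = [[0.0, 0.0], [10.0, 10.0]]
src_transfers = [[1.0, 0.0], [0.0, 1.0]]
tgt_means = [[0.0, 0.0], [5.0, 5.0]]

tgt_transfers = estimate_transfer_vectors(src_means, src_transfers, tgt_means)
adapted = adapt_means(tgt_means, tgt_transfers)
```

In this toy setup, the target Gaussian that coincides with a source Gaussian inherits almost exactly that Gaussian's transfer vector, while the one equidistant from both receives their average.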


Bibliographic reference.  Saiko, Masahiro / Matsuda, Shigeki / Hanazawa, Ken / Isotani, Ryosuke / Hori, Chiori (2013): "Cross-lingual acoustic model adaptation based on transfer vector field smoothing with MAP", In INTERSPEECH-2013, 3322-3326.