13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Initialization Schemes for Multilayer Perceptron Training and their Impact on ASR Performance using Multilingual Data

Ngoc Thang Vu (1), Wojtek Breiter (1), Florian Metze (2), Tanja Schultz (1)

(1) Cognitive Systems Lab, Institute for Anthropomatics, Karlsruhe Institute of Technology (KIT), Germany
(2) Language Technologies Institute, Carnegie Mellon University (CMU), Pittsburgh, PA, USA

In this paper, we present our latest investigation of initialization schemes for Multilayer Perceptron (MLP) training using multilingual data. We show that the overall performance of an MLP network improves significantly when it is initialized with a multilingual MLP. We propose a new strategy called "open target language" MLP to train more flexible models for language adaptation, which is particularly suited to small amounts of training data. Furthermore, applying Bottle-Neck (BN) features from MLPs initialized with a multilingual MLP increases ASR performance both on the languages used for multilingual MLP training and on a new language. Our experiments show word error rate improvements of up to 16.9% relative on a range of tasks for different target languages (Creole and Vietnamese) with manually and automatically transcribed training data.

Index Terms: multilingual multilayer perceptron, Bottle-Neck feature, language adaptation


Bibliographic reference. Vu, Ngoc Thang / Breiter, Wojtek / Metze, Florian / Schultz, Tanja (2012): "Initialization schemes for multilayer perceptron training and their impact on ASR performance using multilingual data", In INTERSPEECH-2012, 2586-2589.