13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

A Initial Attempt on Task-Specific Adaptation for Deep Neural Network-based Large Vocabulary Continuous Speech Recognition

Yeming Xiao, Zhen Zhang, Shang Cai, Jielin Pan, Yonghong Yan

Key Laboratory of Speech Acoustics and Content Understanding, Chinese Academy of Sciences, Beijing, China

In state-of-the-art automatic speech recognition (ASR) systems, adaptation techniques are used to mitigate the performance degradation caused by mismatch between training and testing conditions. Although there are many adaptation techniques for hidden Markov model (HMM)/GMM-based systems, there is little work on adaptation in hybrid artificial neural network (ANN)/HMM-based systems. Recently, there has been a resurgence of interest in the ANN/HMM scheme for ASR, following the success of the context-dependent deep neural network HMM (CD-DNN/HMM). In this paper, we therefore present our initial efforts on adaptation techniques for the CD-DNN/HMM system. Specifically, a linear input network (LIN)-based method and a neural network retraining (NNR)-based method are experimentally explored for task adaptation. Experiments on a conversational telephone speech data set show that these techniques can improve the system significantly, and that the LIN-based method seems to work better with a medium amount of adaptation data.

Index Terms: deep neural network, pre-training, speaker adaptation, LVCSR
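
The abstract names the LIN-based method without giving details, so the following is only a minimal sketch of the general idea behind a linear input network, not the authors' implementation. It assumes a small stand-in network (`FrozenNet`, one tanh hidden layer), a hypothetical helper `adapt_lin`, and plain gradient descent on frame-level cross-entropy; all names, sizes, and the learning rate are illustrative assumptions. The trained network is kept fixed, and only an affine transform of the input features, initialized to the identity, is updated on the adaptation data.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


class FrozenNet:
    """Stand-in for an already trained acoustic model (assumption:
    one tanh hidden layer). Its weights are never updated here."""

    def __init__(self, dim_in, dim_hid, dim_out, rng):
        self.W1 = rng.normal(0.0, 0.5, (dim_in, dim_hid))
        self.b1 = np.zeros(dim_hid)
        self.W2 = rng.normal(0.0, 0.5, (dim_hid, dim_out))
        self.b2 = np.zeros(dim_out)

    def forward(self, x):
        h = np.tanh(x @ self.W1 + self.b1)
        return h, softmax(h @ self.W2 + self.b2)

    def input_grad(self, h, probs, labels):
        """Gradient of the mean cross-entropy with respect to the
        network input, obtained by backpropagating through the
        fixed layers."""
        n = probs.shape[0]
        d_logits = probs.copy()
        d_logits[np.arange(n), labels] -= 1.0
        d_logits /= n
        d_h = d_logits @ self.W2.T
        d_pre = d_h * (1.0 - h ** 2)  # derivative of tanh
        return d_pre @ self.W1.T


def adapt_lin(net, feats, labels, steps=300, lr=0.01):
    """Learn an affine input transform x' = x A + b on adaptation data.

    A starts as the identity and b as zero, so before any update the
    adapted system behaves exactly like the unadapted one.
    """
    dim = feats.shape[1]
    A = np.eye(dim)
    b = np.zeros(dim)
    idx = np.arange(len(labels))
    losses = []
    for _ in range(steps):
        x = feats @ A + b
        h, probs = net.forward(x)
        losses.append(float(-np.mean(np.log(probs[idx, labels] + 1e-12))))
        d_x = net.input_grad(h, probs, labels)
        A -= lr * (feats.T @ d_x)      # only the LIN parameters change
        b -= lr * d_x.sum(axis=0)
    return A, b, losses
```

The transform has only `dim * dim + dim` parameters, far fewer than the network itself. Retraining the whole network, in the spirit of the NNR-based method, would instead update `W1`, `b1`, `W2`, and `b2` directly.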

Bibliographic reference.  Xiao, Yeming / Zhang, Zhen / Cai, Shang / Pan, Jielin / Yan, Yonghong (2012): "A initial attempt on task-specific adaptation for deep neural network-based large vocabulary continuous speech recognition", In INTERSPEECH-2012, 2574-2577.