13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Sparse Banded Precision Matrices for Low Resource Speech Recognition

Weibin Zhang, Pascale Fung

Human Language Technology Center, Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology (HKUST), Clear Water Bay, Hong Kong

We propose using sparse banded precision matrices for speech recognition when training data are insufficient. Previously, we proposed a method that drives the precision matrices toward a sparse structure during training under the HMM framework. The recognition accuracy of the resulting compact model has been shown to be better than that of full covariance or diagonal covariance systems. In this paper, we modify the penalization so that sparse banded precision matrices are learned automatically, which makes the trained models even more compact. We demonstrate that the order of the features is important to the success of the proposed method. Using our proposed feature order, we can substantially reduce the right half-bandwidth of the sparse banded matrices without sacrificing recognition accuracy, which saves memory and computation.
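The memory and computation savings mentioned above can be illustrated with a minimal sketch. It is not the training method of the paper: it only shows, for a symmetric precision matrix that is already banded, how many entries need to be stored and how the Gaussian quadratic form can be evaluated from the band alone. The function names, the NumPy-based layout, and the dimensions used in the usage note are illustrative assumptions.

```python
import numpy as np

def band_mask(dim, half_bandwidth):
    """Boolean mask that is True where |i - j| <= half_bandwidth."""
    idx = np.arange(dim)
    return np.abs(idx[:, None] - idx[None, :]) <= half_bandwidth

def banded_param_count(dim, half_bandwidth):
    """Free entries of a symmetric banded matrix: the main diagonal
    plus the superdiagonals inside the band (hypothetical helper)."""
    return sum(dim - k for k in range(half_bandwidth + 1))

def banded_quadratic_form(precision, x, mean, half_bandwidth):
    """Evaluate (x - mean)^T P (x - mean) using only entries inside the band.

    Assumes `precision` is symmetric and zero outside the band.
    """
    d = x - mean
    # Contribution of the main diagonal.
    total = np.dot(np.diag(precision), d * d)
    for k in range(1, half_bandwidth + 1):
        off = np.diag(precision, k)  # k-th superdiagonal, length dim - k
        # Each off-diagonal entry appears twice because P is symmetric.
        total += 2.0 * np.dot(off, d[:-k] * d[k:])
    return float(total)
```

As an illustration only (the feature dimension is an assumption, not a figure from the paper), a 39-dimensional symmetric precision matrix with half-bandwidth 3 needs `banded_param_count(39, 3)` = 150 stored entries per Gaussian, compared with 780 for a full symmetric matrix, and the quadratic form touches only those 150 entries.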

Index Terms: low resource speech recognition, sparse precision matrix, sparse banded precision matrix


Bibliographic reference. Zhang, Weibin / Fung, Pascale (2012): "Sparse banded precision matrices for low resource speech recognition", in INTERSPEECH-2012, 1914-1917.