INTERSPEECH 2006 - ICSLP
Automatic language identification (LID) decisions are made based on the scores of language models (LMs). In a previous paper, we showed that replacing n-gram LMs with SVMs significantly improved the performance of both PPRLM and GMM-tokenization-based LID systems when tested on the OGI-TS corpus. However, the relatively small size of that corpus may limit the general applicability of the findings. In this paper, we extend the SVM-based approach to the larger CallFriend corpus, evaluated on the NIST 1996 and 2003 evaluation sets. With more data, we find that SVMs still outperform n-gram models. In addition, back-end processing is useful with SVM scores on CallFriend, which differs from our observation on the OGI-TS corpus. By combining the SVM-based GMM and phonotactic systems, our LID system attains an ID error of 12.1% on the NIST 2003 evaluation set, more than 4% absolute (25% relative) better than the baseline n-gram system.
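The phonotactic approach the abstract refers to scores the phone-token output of a recognizer; when the n-gram LM is replaced by an SVM, each utterance is typically mapped to a fixed-length vector of n-gram relative frequencies that the SVM classifies. Below is a minimal sketch of that feature-extraction step, assuming bigram features; the phone strings and function names are illustrative and not taken from the paper.

```python
from collections import Counter
from itertools import chain

def bigram_counts(phones):
    """Count phone bigrams in a decoded phone-token sequence."""
    return Counter(zip(phones, phones[1:]))

def count_vector(phones, vocab):
    """Map a phone sequence to a fixed-length vector of bigram
    relative frequencies, the kind of input a linear SVM can score."""
    counts = bigram_counts(phones)
    total = sum(counts.values()) or 1
    return [counts[b] / total for b in vocab]

# Hypothetical phone strings from a phone recognizer (illustrative only).
eng = ["sil", "dh", "ax", "k", "ae", "t", "sil"]
spa = ["sil", "e", "l", "g", "a", "t", "o", "sil"]

# Shared bigram vocabulary built from the training utterances.
vocab = sorted(set(chain(bigram_counts(eng), bigram_counts(spa))))
x = count_vector(eng, vocab)
```

Each utterance thus becomes one point in a fixed-dimensional space, so a standard linear SVM can be trained per target language and its margin scores passed to the back-end processing mentioned above.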
Bibliographic reference. Yang, Xi / Zhai, Lu-Feng / Siu, Manhung / Gish, Herbert (2006): "Improved language identification using support vector machines for language modeling", In INTERSPEECH-2006, paper 1450-Mon2CaP.5.