A 43 Language Multilingual Punctuation Prediction Neural Network Model

Xinxing Li, Edward Lin


Punctuation prediction is a critical component for speech recognition readability and speech translation segmentation. When multiple languages must be supported, maintaining separate monolingual neural network models for punctuation prediction is costly and may not yield the best accuracy. In this paper, we investigate multilingual Long Short-Term Memory (LSTM) modeling using Byte Pair Encoding (BPE) for punctuation prediction to support 43 languages across 69 countries. Our findings show that a single multilingual BPE-based model can achieve similar or even better performance than separate monolingual word-based models by benefiting from shared information across different languages. On an in-domain news text test set, the multilingual model achieves an average F1-score of 80.2%, while on out-of-domain speech recognition text it achieves 73.5%. We also show that the shared information helps when fine-tuning for low-resource languages.
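
The abstract describes the approach only at a high level: a shared BPE subword vocabulary feeding an LSTM that tags each position with a punctuation label. The sketch below (PyTorch) illustrates that general idea; the vocabulary size, embedding and hidden dimensions, label set, and bidirectionality are illustrative assumptions, not the configuration reported in the paper.

# Minimal sketch of a BPE-token LSTM punctuation tagger.
# All sizes and the label set are assumptions for illustration only.
import torch
import torch.nn as nn

PUNCT_LABELS = ["O", "COMMA", "PERIOD", "QUESTION"]  # assumed label inventory

class PunctuationTagger(nn.Module):
    def __init__(self, vocab_size=32000, emb_dim=256, hidden_dim=512,
                 num_labels=len(PUNCT_LABELS)):
        super().__init__()
        # One shared BPE embedding table covers all languages.
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Recurrent encoder over the subword sequence.
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        # Per-position projection onto the punctuation labels.
        self.proj = nn.Linear(2 * hidden_dim, num_labels)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) BPE ids from a shared multilingual vocabulary
        x = self.embed(token_ids)
        h, _ = self.lstm(x)
        return self.proj(h)  # (batch, seq_len, num_labels) logits

if __name__ == "__main__":
    model = PunctuationTagger()
    dummy_ids = torch.randint(0, 32000, (2, 20))   # two sequences of 20 BPE ids
    logits = model(dummy_ids)
    predictions = logits.argmax(dim=-1)            # predicted label per subword
    print(predictions.shape)                       # torch.Size([2, 20])

Because the vocabulary and embedding table are shared across languages, a single model of this shape can be trained on pooled multilingual data, which is the property the abstract credits for matching or beating separate monolingual word-based models.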


DOI: 10.21437/Interspeech.2020-2052

Cite as: Li, X., Lin, E. (2020) A 43 Language Multilingual Punctuation Prediction Neural Network Model. Proc. Interspeech 2020, 1067-1071, DOI: 10.21437/Interspeech.2020-2052.


@inproceedings{Li2020,
  author={Xinxing Li and Edward Lin},
  title={{A 43 Language Multilingual Punctuation Prediction Neural Network Model}},
  year={2020},
  booktitle={Proc. Interspeech 2020},
  pages={1067--1071},
  doi={10.21437/Interspeech.2020-2052},
  url={http://dx.doi.org/10.21437/Interspeech.2020-2052}
}