Analysis and modeling of between-sentence pauses in news speech by Japanese newscasters

Shizuka Nakamura, Carlos Toshinori Ishi, Tatsuya Kawahara


Many speech synthesizers give little consideration to between-sentence pauses, which may be one cause of the monotony of continuous synthesized speech. To reduce this monotony and make synthesized speech sound more like news speech, we analyzed the between-sentence pause durations in news speech by two newscasters and constructed a model to predict them. Analysis of the pause durations first revealed that the difference between the two newscasters' distributions is largely affected by pauses after lead sentences, which allow a large degree of freedom. A prosodic context analysis then showed that the following prosodic features correlate with between-sentence pause duration: the average F0 of the last part of the preceding sentence and the number of morae in the succeeding sentence. The correlation coefficient between the values predicted by a multiple linear regression model using these parameters and the measured values was 0.44 for the test data. This indicates that between-sentence pause durations can be predicted to some extent from prosodic information about the preceding and succeeding sentences. The news-speech likeness of continuous synthesized speech can be improved by incorporating this model into existing speech synthesizers that generate speech sentence by sentence.
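
The modeling step described in the abstract (a multiple linear regression on two prosodic features, evaluated by the correlation between predicted and measured pause durations on held-out data) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the train/test split, and all data below are assumptions for illustration only: the data are synthetic, and the coefficients used to generate them (including their signs) are arbitrary and do not reflect the paper's findings.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply fitted coefficients (intercept first) to a feature matrix."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])

# SYNTHETIC data (assumption): not taken from the paper.
rng = np.random.default_rng(0)
n = 200
mean_f0_hz = rng.normal(150.0, 20.0, n)            # avg F0, end of preceding sentence
n_morae = rng.integers(20, 120, n).astype(float)   # morae in succeeding sentence
# Arbitrary generating relation plus noise, used only to have something to fit.
pause_s = 1.0 - 0.002 * mean_f0_hz + 0.003 * n_morae + rng.normal(0.0, 0.2, n)

X = np.column_stack([mean_f0_hz, n_morae])
train, test = slice(0, 150), slice(150, None)      # hypothetical split

coef = fit_linear_regression(X[train], pause_s[train])
pred = predict(coef, X[test])
r_test = pearson_r(pred, pause_s[test])
print("coefficients (intercept, F0, morae):", coef)
print("test-set correlation:", r_test)
```

In a sentence-by-sentence synthesizer, the predicted value for each sentence boundary would be used as the duration of silence inserted between the two synthesized sentences.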


DOI: 10.21437/SpeechProsody.2020-139

Cite as: Nakamura, S., Ishi, C.T., Kawahara, T. (2020) Analysis and modeling of between-sentence pauses in news speech by Japanese newscasters. Proc. 10th International Conference on Speech Prosody 2020, 680-684, DOI: 10.21437/SpeechProsody.2020-139.


@inproceedings{Nakamura2020,
  author={Shizuka Nakamura and Carlos Toshinori Ishi and Tatsuya Kawahara},
  title={{Analysis and modeling of between-sentence pauses in news speech by Japanese newscasters}},
  year=2020,
  booktitle={Proc. 10th International Conference on Speech Prosody 2020},
  pages={680--684},
  doi={10.21437/SpeechProsody.2020-139},
  url={http://dx.doi.org/10.21437/SpeechProsody.2020-139}
}