Estimation of Hidden Speaking Rate

Guan-Ting Liou, Chen-Yu Chiang, Yih-Ru Wang, Sin-Horng Chen


This paper proposes the hidden speaking rate. In contrast to traditional raw speaking rate estimation, which simply averages the number of syllables or phones per second with or without pauses, the proposed hidden speaking rate is estimated by normalizing the effects of lexical information and prosodic structure, based on the existing speaking rate-dependent hierarchical prosodic model (SR-HPM). The significance of the hidden speaking rate is illustrated by an analysis of speaking rate estimation on a Mandarin speech database containing four parallel speech corpora recorded by a female professional announcer at fast, normal, medium and slow speaking rates. A prosody generation experiment on the same speech corpus shows that the hidden speaking rate represents the speaker’s intended (underlying) speaking rate more meaningfully and accurately than the conventional raw speaking rate.
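For orientation, the conventional raw speaking rate mentioned in the abstract can be sketched as a simple ratio of syllable count to duration. The following is only an illustrative sketch, not the authors' code: the function name, its parameters, and the example numbers are hypothetical, and it does not implement the SR-HPM-based hidden speaking rate.

```python
def raw_speaking_rate(syllable_count, speech_seconds, pause_seconds=0.0,
                      include_pauses=True):
    """Return syllables per second for one utterance.

    Hypothetical helper for illustration only. `speech_seconds` is the
    total duration of the spoken syllables and `pause_seconds` is the
    total duration of the pauses between them.
    """
    # The denominator either includes or excludes pause time, matching
    # the "with or without pauses" variants of the raw measure.
    duration = speech_seconds + (pause_seconds if include_pauses else 0.0)
    if duration <= 0:
        raise ValueError("duration must be positive")
    return syllable_count / duration


# Made-up utterance: 20 syllables, 4.0 s of speech, 1.0 s of pauses.
rate_with_pauses = raw_speaking_rate(20, 4.0, 1.0, include_pauses=True)
rate_without_pauses = raw_speaking_rate(20, 4.0, 1.0, include_pauses=False)
print(rate_with_pauses, rate_without_pauses)
```

Because this measure depends only on counts and durations, it cannot separate the speaker's intended rate from variation caused by the text being read or by its prosodic structure, which is the gap the hidden speaking rate is meant to address.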


DOI: 10.21437/SpeechProsody.2018-120

Cite as: Liou, G., Chiang, C., Wang, Y., Chen, S. (2018) Estimation of Hidden Speaking Rate. Proc. 9th International Conference on Speech Prosody 2018, 592-596, DOI: 10.21437/SpeechProsody.2018-120.


@inproceedings{Liou2018,
  author={Guan-Ting Liou and Chen-Yu Chiang and Yih-Ru Wang and Sin-Horng Chen},
  title={Estimation of Hidden Speaking Rate},
  year=2018,
  booktitle={Proc. 9th International Conference on Speech Prosody 2018},
  pages={592--596},
  doi={10.21437/SpeechProsody.2018-120},
  url={http://dx.doi.org/10.21437/SpeechProsody.2018-120}
}