Minimum Sample Length for the Estimation of Long-term Speaking Rate

Pablo Arantes, Anders Eriksson, Verônica Lima


In this study, we expand on previous experiments designed to determine the minimum length an audio sample must have for the speaking rate derived from it to be representative of the sample as a whole. We compare two approaches to establishing that the time series of the cumulative speaking rate calculated over the audio sample has reached stability. We also examine the effect on stabilization time of four other factors that may affect the way speaking rate is calculated. The results show that all factors tested have significant effects, although these are of limited practical concern. Overall, the average stabilization time is 12.1 s, with the bulk of the distribution lying between 7.9 and 16.2 s.
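
To make the idea of a cumulative speaking rate reaching stability concrete, the following is a minimal sketch in Python. It is not the authors' procedure: the abstract does not say which two stability criteria are compared, so the tolerance-band rule used here is only an assumed stand-in. The function names, the `tolerance` parameter, and the input format (a list of end times, in seconds, of counted units such as syllables) are likewise hypothetical.

```python
from typing import List, Optional, Tuple


def cumulative_rate(unit_end_times: List[float]) -> List[Tuple[float, float]]:
    """Return (time, rate) pairs, where rate is the number of units
    produced so far divided by the elapsed time in seconds."""
    return [(t, (i + 1) / t) for i, t in enumerate(unit_end_times) if t > 0]


def stabilization_time(
    series: List[Tuple[float, float]], tolerance: float = 0.05
) -> Optional[float]:
    """Assumed criterion (not taken from the paper): the earliest time from
    which every cumulative rate stays within +/- tolerance (relative) of the
    final cumulative rate."""
    if not series:
        return None
    final = series[-1][1]
    stable_from: Optional[float] = None
    for t, rate in series:
        if abs(rate - final) <= tolerance * final:
            if stable_from is None:
                stable_from = t  # start of a candidate stable stretch
        else:
            stable_from = None  # left the band, so discard the candidate
    return stable_from


# Toy data: end times (s) of seven counted units.
series = cumulative_rate([0.1, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
print(stabilization_time(series, tolerance=0.1))
```

Under this assumed rule the last point always lies inside the band, so a stabilization time is reported for any non-empty sample. A stricter criterion might also require the stable stretch to have a minimum duration.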


DOI: 10.21437/SpeechProsody.2018-134

Cite as: Arantes, P., Eriksson, A., Lima, V. (2018) Minimum Sample Length for the Estimation of Long-term Speaking Rate. Proc. 9th International Conference on Speech Prosody 2018, 661-665, DOI: 10.21437/SpeechProsody.2018-134.


@inproceedings{Arantes2018,
  author={Pablo Arantes and Anders Eriksson and Verônica Lima},
  title={Minimum Sample Length for the Estimation of Long-term Speaking Rate},
  year=2018,
  booktitle={Proc. 9th International Conference on Speech Prosody 2018},
  pages={661--665},
  doi={10.21437/SpeechProsody.2018-134},
  url={http://dx.doi.org/10.21437/SpeechProsody.2018-134}
}