Prosodic Comparison of Utterances without Extracting Fundamental Frequencies based on Vocalized Subharmonic Summation

Takuya Ozuru, Nobuaki Minematsu, Daisuke Saito


In language learning classes and actor training, learners and trainees often compare the prosody of their utterances with that of a model speaker: they want to know how similar the pitch movement in their utterances is to that in the model utterances. To automate this prosodic comparison, this paper examines a classical but highly useful algorithm for comparing the F0 of a reference sound with that of an input sound. The algorithm, SubHarmonic Summation (SHS), is widely used for instrumental sounds and singing voices. Since both the reference stream and the input stream are vocal utterances here, a modified SHS algorithm is proposed and tested experimentally. Notably, the proposed method can calculate the similarity in pitch movement between the two utterances without extracting fundamental frequencies. The theoretical foundation and experimental verification of the proposed method are presented. The experiments show that the method detects speech segments produced with inadequate prosodic control better than the classical SHS, even without extracting fundamental frequencies.
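For readers unfamiliar with SHS, the following is a minimal sketch of the classical idea only: each candidate F0 is scored by summing decayed spectral magnitudes at its harmonics. It is not the modified algorithm proposed in the paper, whose details are not given in this abstract. The function names, the decay value of 0.84, the number of harmonics, and the cosine comparison of two salience profiles are all assumptions made for illustration; the comparison step merely shows how two sounds might be compared without picking an F0 value.

```python
import math

def harmonic_tone(f0, sample_rate, n_samples, amplitudes):
    """Synthetic test signal: a sum of sinusoids at multiples of f0 (hypothetical helper)."""
    return [
        sum(a * math.sin(2.0 * math.pi * (k + 1) * f0 * n / sample_rate)
            for k, a in enumerate(amplitudes))
        for n in range(n_samples)
    ]

def magnitude_at(signal, sample_rate, freq):
    """Spectral magnitude at one frequency, by direct projection onto a complex sinusoid."""
    re = im = 0.0
    for n, x in enumerate(signal):
        phase = 2.0 * math.pi * freq * n / sample_rate
        re += x * math.cos(phase)
        im -= x * math.sin(phase)
    return math.hypot(re, im) / len(signal)

def shs_salience(signal, sample_rate, candidates, n_harmonics=5, decay=0.84):
    """Classical SHS score per candidate F0: decayed sum of magnitudes at its harmonics.

    n_harmonics and decay are assumed values, not taken from the paper.
    """
    nyquist = sample_rate / 2.0
    salience = []
    for f0 in candidates:
        total = 0.0
        for k in range(1, n_harmonics + 1):
            if k * f0 >= nyquist:
                break  # skip harmonics above the Nyquist frequency
            total += decay ** (k - 1) * magnitude_at(signal, sample_rate, k * f0)
        salience.append(total)
    return salience

def cosine_similarity(a, b):
    """Cosine similarity of two salience profiles (an illustrative comparison, an assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0

# Demo on synthetic tones: a 200 Hz "reference" and a 300 Hz "input".
sample_rate = 8000
candidates = list(range(100, 401, 10))
reference = harmonic_tone(200.0, sample_rate, 400, [1.0, 0.5, 0.25])
other = harmonic_tone(300.0, sample_rate, 400, [1.0, 0.5, 0.25])

ref_salience = shs_salience(reference, sample_rate, candidates)
other_salience = shs_salience(other, sample_rate, candidates)

best_candidate = candidates[ref_salience.index(max(ref_salience))]
sim_same = cosine_similarity(ref_salience, ref_salience)
sim_diff = cosine_similarity(ref_salience, other_salience)
```

In this sketch, `best_candidate` corresponds to classical F0 extraction (taking the peak of the salience profile), whereas `sim_same` and `sim_diff` compare whole profiles and never select a peak.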


DOI: 10.21437/SpeechProsody.2018-35

Cite as: Ozuru, T., Minematsu, N., Saito, D. (2018) Prosodic Comparison of Utterances without Extracting Fundamental Frequencies based on Vocalized Subharmonic Summation. Proc. 9th International Conference on Speech Prosody 2018, 172-176, DOI: 10.21437/SpeechProsody.2018-35.


@inproceedings{Ozuru2018,
  author={Takuya Ozuru and Nobuaki Minematsu and Daisuke Saito},
  title={Prosodic Comparison of Utterances without Extracting Fundamental Frequencies based on Vocalized Subharmonic Summation},
  year=2018,
  booktitle={Proc. 9th International Conference on Speech Prosody 2018},
  pages={172--176},
  doi={10.21437/SpeechProsody.2018-35},
  url={http://dx.doi.org/10.21437/SpeechProsody.2018-35}
}