Speech Prosody 2010

Chicago, IL, USA
May 10-14, 2010

Capturing Inter-speaker Invariance Using Statistical Measures of Rhythm

Tae-Jin Yoon

Department of Linguistics and Languages, McMaster University, Canada

Statistical rhythm metrics are applied to the Buckeye corpus [1] of spontaneous interview speech in order to investigate the extent of rhythmic variability between speakers as well as within speakers. The Buckeye corpus is unique in that its speakers were raised in the same region of North America and hence share the same regional dialect. Rhythm metrics are computed for each of 10 speakers. The results show that the rhythmic measures that capture the least dialectal variance are the normalized pairwise variability indices calculated from adjacent consonantal durations and adjacent vocalic durations. This finding implies that these statistical measures of rhythm can be used to capture dialectal similarities.

Index Terms: speech rhythm, Buckeye corpus, rhythm metrics, between-speaker rhythmic variability, within-speaker rhythmic variability
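For readers unfamiliar with the normalized pairwise variability index (nPVI) named in the abstract, the following is a minimal Python sketch. It assumes the commonly used formulation, in which the absolute difference between each pair of successive interval durations is divided by the mean of the pair, averaged over all pairs, and scaled by 100. The abstract does not state the exact formula or tooling used in the paper, so the function name `npvi` and the sample durations are hypothetical and serve only as illustration.

```python
from typing import Sequence


def npvi(durations: Sequence[float]) -> float:
    """Normalized pairwise variability index over successive durations.

    Assumed formulation (not confirmed by the abstract):
        100 * mean(|d_k - d_{k+1}| / ((d_k + d_{k+1}) / 2))
    Durations are expected to be positive (e.g. seconds or milliseconds).
    """
    if len(durations) < 2:
        raise ValueError("nPVI needs at least two intervals")
    # One normalized difference per pair of adjacent intervals.
    terms = [
        abs(a - b) / ((a + b) / 2.0)
        for a, b in zip(durations, durations[1:])
    ]
    return 100.0 * sum(terms) / len(terms)


# Hypothetical vocalic interval durations in seconds, for illustration only.
vocalic = [0.10, 0.20, 0.10, 0.20]
print(f"vocalic nPVI: {npvi(vocalic):.2f}")
```

Because each difference is divided by the local mean duration, the index is insensitive to overall speaking rate. The same function could be applied separately to the consonantal and vocalic interval sequences of each speaker.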


  1. Pitt, M.A., Dilley, L., Johnson, K., Kiesling, S., Raymond, W., Hume, E. and Fosler-Lussier, E. (2007) Buckeye Corpus of Conversational Speech (2nd rel.) [www.buckeyecorpus.osu.edu] Columbus, OH: Department of Psychology, Ohio State University (Distributor).


Bibliographic reference. Yoon, Tae-Jin (2010): "Capturing inter-speaker invariance using statistical measures of rhythm", in SP-2010, paper 201.