Speech Prosody 2012
This paper describes a tool designed to allow linguists to manipulate the prosody of an utterance via a symbolic representation in order to evaluate linguistic models. Prosody is manipulated via a Praat TextGrid, which allows the user to modify the rhythm and melody. Rhythm is manipulated by factoring segmental duration into three components: (i) intrinsic duration determined by phonemic identity, (ii) local modifications encoded on the rhythm tier, and (iii) global variations of speech rate encoded on the intonation tier. Melody is similarly determined by tonal segments on the tonal tier (= pitch accents) and on the intonation tier (= boundary tones), together with global parameters of key and span, which determine changes of pitch register. The TextGrid is used to generate a Manipulation object, which can be used either for immediate interactive assessment of the prosody determined by the annotation or to generate synthesised stimuli for more formal perceptual experiments.
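The three-way factoring of segmental duration can be illustrated with a minimal Python sketch. The abstract does not state how the three components are combined, so the arithmetic here is an assumption: the local modification is treated as a multiplicative scale factor and the global speech rate as a divisor. The table `INTRINSIC_DURATION`, the class `Segment`, the function `segment_duration`, and all numeric values are hypothetical and are not taken from the tool itself.

```python
from dataclasses import dataclass

# Hypothetical intrinsic durations in seconds, keyed by phoneme label.
# Illustrative values only; they are not taken from the paper.
INTRINSIC_DURATION = {"a": 0.090, "t": 0.060, "s": 0.100}


@dataclass
class Segment:
    phoneme: str
    # Local modification, as might be read from a rhythm tier.
    # ASSUMPTION: modelled as a multiplicative factor (1.0 = unchanged).
    local_scale: float = 1.0


def segment_duration(segment: Segment, speech_rate: float = 1.0) -> float:
    """Combine the three duration components for one segment.

    ASSUMPTION: duration = intrinsic * local_scale / speech_rate.
    The source only names the three components, not this formula.
    """
    if speech_rate <= 0:
        raise ValueError("speech_rate must be positive")
    intrinsic = INTRINSIC_DURATION[segment.phoneme]
    return intrinsic * segment.local_scale / speech_rate


if __name__ == "__main__":
    # A locally lengthened vowel at the default global rate.
    print(segment_duration(Segment("a", local_scale=1.5)))
    # An unmodified consonant spoken at twice the global rate.
    print(segment_duration(Segment("t"), speech_rate=2.0))
```

Keeping the components separate in this way means a local lengthening and a global rate change can be varied independently, which matches the division of labour between the rhythm tier and the intonation tier described above.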
Index Terms: speech synthesis, speech prosody, analysis by synthesis, linguistic models, rhythm, melody
Bibliographic reference. Hirst, Daniel (2012): "Prozed: a speech prosody analysis-by-synthesis tool for linguists", In SP-2012, 15-18.