Speech Prosody 2010
Chicago, IL, USA
This paper is devoted to modeling the prosody of whispered Russian speech. The practical purpose of this research is to extend voice cloning techniques to the whispered speech modality. The authors present their analysis of the prosodic features that contribute to the expression of sentence-type intonation in whispered speech. The current investigation covers intonation contours in complete and incomplete declaratives, as well as in interrogatives and exclamations. Since the fundamental frequency is absent in whisper, the major role in conveying sentence-type intonation is taken over by formant values. For modeling the prosody of whispered speech, an extension of the Accent Unit Portrait Model is proposed. The paper demonstrates how melodic, rhythmic and dynamic (energy) portraits of accent units can be built and employed for whispered speech modifications by a concatenative text-to-speech synthesizer.
Index Terms: whispered speech, prosody modeling, speech synthesis, accent unit portrait model, formant modification.
Bibliographic reference. Petrushin, Valery A. / Tsirulnik, Liliya I. / Makarova, Veronika (2010): "Whispered speech prosody modeling for TTS synthesis", In SP-2010, paper 288.