Speech Prosody 2010

Chicago, IL, USA
May 10-14, 2010

Acoustic, Electroglottographic and Paralinguistic Analyses of “Rikimi” in Expressive Speech

Carlos T. Ishi, Hiroshi Ishiguro, Norihiro Hagita

Intelligent Robotics and Communication Labs., ATR, Kyoto, Japan

“Rikimi” is a “pressed-type” voice quality that appears in Japanese conversational speech to express paralinguistic information related to the emotional or attitudinal behavior of the speaker. We conducted acoustic, electroglottographic (EGG) and paralinguistic analyses on speech segments containing “rikimi”, extracted from spontaneous dialogue speech data. “Rikimi” may be accompanied by other voice qualities (such as creaky or harsh), but analyses of vocal fold vibratory patterns based on the EGG signals revealed a feature common to “rikimi” segments, relative to non-“rikimi” segments, in the relation between overall open and closed intervals. Spectral analyses showed that parameters related to spectral tilt are effective for identifying part of the “rikimi” segments, but fail when vowels are nasalized. F0 contour analysis showed that a dip occurs during “rikimi” segments, but that the change in voice quality is perceived more prominently than the change in the intonation contour. Linguistic content was also found to influence the perception of “rikimi” in the conveyance of paralinguistic information.
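
The relation between open and closed intervals mentioned above is often summarized as a closed quotient, the fraction of one glottal cycle during which the vocal folds are in contact. The abstract does not state how the intervals were measured, so the following is only a minimal sketch under stated assumptions: it uses a simple criterion-level rule, the function name `closed_quotient`, the 50% level, and the signal polarity (larger value means more contact) are all illustrative choices, and the two cycles are synthetic rather than data from the paper.

```python
def closed_quotient(cycle, level=0.5):
    """Return the fraction of one EGG cycle spent above a criterion level.

    cycle: samples covering exactly one glottal period; a larger value is
           assumed to mean more vocal fold contact (polarity is an assumption).
    level: criterion as a fraction of the peak-to-peak amplitude
           (0.5 is an illustrative choice, not a value from the paper).
    """
    lo, hi = min(cycle), max(cycle)
    if hi == lo:
        raise ValueError("flat cycle: no contact variation to threshold")
    threshold = lo + level * (hi - lo)
    # Count samples in which the signal exceeds the criterion level.
    closed = sum(1 for sample in cycle if sample > threshold)
    return closed / len(cycle)


# Synthetic, idealized cycles of 10 samples each (not measured data).
short_contact = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]  # contact in 4 of 10 samples
long_contact = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]   # contact in 7 of 10 samples

print(closed_quotient(short_contact))
print(closed_quotient(long_contact))
```

Comparing such a per-cycle value between segment types is one way to express a difference in the balance of open and closed intervals. The sketch makes no claim about the direction or size of the difference reported in the paper.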

Index Terms: pressed voice, voice quality, EGG, expressive speech, prosody


Bibliographic reference. Ishi, Carlos T. / Ishiguro, Hiroshi / Hagita, Norihiro (2010): "Acoustic, electroglottographic and paralinguistic analyses of “rikimi” in expressive speech", in SP-2010, paper 139.