Speech Prosody 2004

Nara, Japan
March 23-26, 2004

HMM-Based Speech Synthesis with Various Speaking Styles Using Model Interpolation

Makoto Tachibana, Junichi Yamagishi, Koji Onishi, Takashi Masuko, Takao Kobayashi

Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Yokohama, Japan

This paper presents an approach to realizing various speaking styles and emotional expressions in HMM-based speech synthesis using a model interpolation technique. In this approach, speech with a speaking style intermediate between representative styles is synthesized from a model obtained by interpolating the representative style models. We chose three representative styles, "reading," "joyful," and "sad," and synthesized speech from models obtained by interpolating the two style models for every pair of styles. The results of a subjective similarity evaluation show that speech generated from an interpolated model has a speaking style between the two representative styles.
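As a rough illustration of what interpolating two style models can mean at the level of a single HMM state, the sketch below linearly combines the mean and variance of two diagonal Gaussian output distributions. The abstract does not specify the interpolation scheme, so the linear rule, the function name `interpolate_gaussians`, and all numeric values are assumptions made for illustration only, not the authors' method or data.

```python
import numpy as np


def interpolate_gaussians(mean_a, var_a, mean_b, var_b, weight_a):
    """Linearly interpolate two diagonal Gaussians.

    Assumption: a simple weighted sum of means and variances; the
    paper may use a different interpolation rule.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    weight_b = 1.0 - weight_a
    mean = weight_a * np.asarray(mean_a, dtype=float) \
        + weight_b * np.asarray(mean_b, dtype=float)
    var = weight_a * np.asarray(var_a, dtype=float) \
        + weight_b * np.asarray(var_b, dtype=float)
    return mean, var


# Hypothetical state parameters for two style models (made-up numbers).
reading_mean, reading_var = [1.0, 2.0], [1.0, 1.0]
joyful_mean, joyful_var = [3.0, 6.0], [3.0, 1.0]

# Equal weights give a state halfway between "reading" and "joyful".
mid_mean, mid_var = interpolate_gaussians(
    reading_mean, reading_var, joyful_mean, joyful_var, weight_a=0.5
)
print(mid_mean, mid_var)
```

Setting `weight_a` to 1.0 or 0.0 recovers one of the two representative models, and values in between move the state parameters gradually from one style toward the other.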


Bibliographic reference. Tachibana, Makoto / Yamagishi, Junichi / Onishi, Koji / Masuko, Takashi / Kobayashi, Takao (2004): "HMM-based speech synthesis with various speaking styles using model interpolation", In SP-2004, 413-416.