14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

A Style Control Technique for Singing Voice Synthesis Based on Multiple-Regression HSMM

Takashi Nose, Misa Kanemoto, Tomoki Koriyama, Takao Kobayashi

Tokyo Institute of Technology, Japan

This paper proposes a technique for controlling singing style in HMM-based singing voice synthesis. A style control technique based on the multiple-regression HSMM (MRHSMM), originally proposed for HMM-based expressive speech synthesis, is applied to conventional HMM-based singing voice synthesis. Pitch adaptive training is introduced into the MRHSMM to improve the modeling accuracy of the fundamental frequency (F0) associated with notes. A robust vibrato modeling technique based on a moving average filter is also proposed to reproduce natural-sounding vibrato even when the vibrato in the original singing voice is unclear. Subjective evaluation results show that users can intuitively control the singing style while the naturalness of the synthetic voice is maintained.
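The abstract gives no equations, so the following is only a minimal sketch of the general idea behind multiple-regression style control: a state's mean parameter is modeled as a linear function of a low-dimensional style vector, so changing the style vector shifts the generated parameters. The function name `style_dependent_mean`, the regression matrix `H`, and all numeric values are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
def style_dependent_mean(H, style):
    """Return mu = H * xi, where xi = [1, v_1, ..., v_L] (assumed formulation).

    H     : list of rows, each of length L + 1 (bias column first).
    style : list of L style-vector components.
    """
    # Augment the style vector with a leading 1 so the first column of H
    # acts as a bias term (the mean for the all-zero style vector).
    xi = [1.0] + [float(v) for v in style]
    if any(len(row) != len(xi) for row in H):
        raise ValueError("each row of H must have len(style) + 1 entries")
    # Plain matrix-vector product, one output component per row of H.
    return [sum(h * x for h, x in zip(row, xi)) for row in H]


# Hypothetical regression matrix: 2-dimensional feature, 1-dimensional style.
H = [
    [5.0, 0.5],   # e.g. a log-F0-like component (illustrative values)
    [1.0, -0.2],  # e.g. a second acoustic component (illustrative values)
]

neutral = style_dependent_mean(H, [0.0])       # style axis at its origin
intermediate = style_dependent_mean(H, [0.5])  # halfway along the style axis
strong = style_dependent_mean(H, [1.0])        # full style intensity
```

Under this assumption, intermediate style values interpolate the mean linearly between the two extremes, which is one way a user could adjust style intensity with a single continuous control.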


Bibliographic reference.  Nose, Takashi / Kanemoto, Misa / Koriyama, Tomoki / Kobayashi, Takao (2013): "A style control technique for singing voice synthesis based on multiple-regression HSMM", In INTERSPEECH-2013, 378-382.