13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Synthetic F0 Can Effectively Convey Speaker ID in Delexicalized Speech

Eric Morley, Esther Klabbers, Jan P. H. van Santen, Alexander Kain, Seyed Hamidreza Mohammadi

Center for Spoken Language Understanding, Oregon Health & Science University, Portland, OR, USA

We investigate the extent to which F0 can convey speaker ID in the absence of spectral, segmental, and durational information. We propose two methods of F0 synthesis based on the Linear Alignment Model (LAM, van Santen 2000): one parametric, the other corpus-based. Through a perceptual experiment, we show that F0 alone is able to convey information about speaker ID. We find that F0 synthesized with either LAM-based method conveys speaker ID almost as effectively as natural F0.

Index Terms: F0, prosody, speech synthesis, speaker identity, recombinant synthesis

Bibliographic reference. Morley, Eric / Klabbers, Esther / Santen, Jan P. H. van / Kain, Alexander / Mohammadi, Seyed Hamidreza (2012): "Synthetic F0 can effectively convey speaker ID in delexicalized speech", In INTERSPEECH-2012, 434-437.