We investigate the extent to which F0 can convey speaker ID in the absence of spectral, segmental, and durational information. We propose two methods of F0 synthesis based on the Linear Alignment Model (LAM, van Santen 2000): one parametric, the other corpus-based. Through a perceptual experiment, we show that F0 alone can convey information about speaker ID. We find that F0 synthesized with either LAM-based method conveys speaker ID almost as effectively as natural F0.
Index Terms: F0, prosody, speech synthesis, speaker identity, recombinant synthesis
Bibliographic reference. Morley, Eric / Klabbers, Esther / Santen, Jan P. H. van / Kain, Alexander / Mohammadi, Seyed Hamidreza (2012): "Synthetic F0 can effectively convey speaker ID in delexicalized speech", In INTERSPEECH-2012, 434-437.