5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Formant Diphone Parameter Extraction Utilising a Labelled Single-Speaker Database

Robert H. Mannell

SHLRC, Macquarie University, Australia

This paper examines a method for extracting formant parameters from a labelled single-speaker database for use in a formant-parameter diphone-concatenation speech synthesis system. The procedure begins with an initial formant analysis of the labelled database, the results of which are used to obtain formant (F1-F5) probability spaces for each phoneme. These probability spaces then guide a more careful speaker-specific extraction of formant frequencies. Finally, an analysis-by-synthesis procedure provides the best-matching formant intensity and bandwidth parameters. The great majority of the parameters extracted in this way produce speech which is highly intelligible and which has a voice quality close to that of the original speaker.
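As a rough illustration of the probability-space idea, the sketch below pools the formant values from an initial analysis by phoneme label and then uses the resulting statistics to choose among candidate spectral peaks in a later frame. The abstract does not say how the probability spaces are modelled, so everything here is an assumption made for illustration: each formant is treated as an independent Gaussian per phoneme, the function names (build_probability_spaces, pick_formants) are hypothetical, and the numbers are invented. Three formants are used for brevity, although the paper works with F1-F5.

```python
import math
from collections import defaultdict
from itertools import combinations
from statistics import mean, pstdev


def build_probability_spaces(frames):
    """Pool initial formant estimates by phoneme label.

    frames: iterable of (phoneme, [F1, F2, ...]) pairs in Hz.
    Returns {phoneme: [(mean, std), ...]}, one pair per formant.
    Assumption: each formant is modelled as an independent Gaussian.
    """
    by_phoneme = defaultdict(list)
    for phoneme, formants in frames:
        by_phoneme[phoneme].append(formants)
    spaces = {}
    for phoneme, rows in by_phoneme.items():
        columns = zip(*rows)  # one column of values per formant
        # Floor the deviation at 1 Hz so a single-frame phoneme stays usable.
        spaces[phoneme] = [(mean(c), max(pstdev(c), 1.0)) for c in columns]
    return spaces


def log_likelihood(value, mu, sigma):
    """Log density of a Gaussian with mean mu and deviation sigma."""
    z = (value - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def pick_formants(candidates, space):
    """Choose the ascending subset of candidate peaks that best fits the space."""
    ordered = sorted(candidates)
    if len(ordered) < len(space):
        raise ValueError("fewer candidate peaks than formants to assign")

    def score(combo):
        return sum(log_likelihood(v, mu, sd) for v, (mu, sd) in zip(combo, space))

    # combinations() of a sorted list keeps each subset in ascending order,
    # so the first element is scored as F1, the second as F2, and so on.
    return list(max(combinations(ordered, len(space)), key=score))


# Invented example values, not data from the paper.
initial = [
    ("i:", [300, 2300, 3000]),
    ("i:", [320, 2250, 3100]),
    ("i:", [280, 2350, 2950]),
]
spaces = build_probability_spaces(initial)
peaks = [310, 1200, 2280, 3050, 3900]  # candidate spectral peaks in one frame
chosen = pick_formants(peaks, spaces["i:"])
```

Scoring whole ascending subsets rather than picking each formant independently keeps the result ordered (F1 < F2 < F3) and prevents one peak from being assigned to two formants. The subsequent analysis-by-synthesis step for intensities and bandwidths is not covered by this sketch.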


Bibliographic reference.  Mannell, Robert H. (1998): "Formant diphone parameter extraction utilising a labelled single-speaker database", In ICSLP-1998, paper 0627.