Speech Synthesis


Speech synthesis has a long history. Andrew Maas gives a good introduction in the May 17 lecture of the Stanford Spoken Language Processing course.

Simon King's Course on Speech Synthesis is a comprehensive multimedia presentation.

Most attention has been paid to ‘Text-to-Speech’ applications, i.e. type in the words you want and have them spoken for you.

Speech synthesis systems are evaluated in terms of intelligibility (how many words are correctly identified by listeners?) and naturalness (to what extent does the synthesis resemble a normal human voice?).
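As a rough illustration of the two measures, the sketch below scores intelligibility as the fraction of reference words that appear, in order, in a listener's transcript, and naturalness as the average of listener ratings. The 1 to 5 rating scale (a mean opinion score) and the function names are assumptions made for this example, not something prescribed above.

```python
from difflib import SequenceMatcher


def word_accuracy(reference: str, transcript: str) -> float:
    """Fraction of reference words the listener identified, in order."""
    ref_words = reference.lower().split()
    heard_words = transcript.lower().split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    matcher = SequenceMatcher(a=ref_words, b=heard_words, autojunk=False)
    # Sum the sizes of all in-order matching runs of words.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)


def mean_opinion_score(ratings: list[float]) -> float:
    """Average naturalness rating (assumed scale: 1 = bad, 5 = excellent)."""
    if not ratings:
        raise ValueError("need at least one rating")
    return sum(ratings) / len(ratings)


if __name__ == "__main__":
    score = word_accuracy("the cat sat on the mat", "the cat sat on a mat")
    print(f"intelligibility: {score:.2f}")
    print(f"naturalness (MOS): {mean_opinion_score([4, 5, 3, 4]):.2f}")
```

In practice both measures are collected from many listeners over many test sentences, and the in-order matching used here is only one simple way to count correctly identified words.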

Visual overview of SCOOT Synthesis Topics.

Text to Speech

Synthesis from written text (orthography) involves two stages:

  • Text Analysis – transform the text into an intermediate representation
  • Waveform Generation – render the speech acoustics from the intermediate representation

To a large extent these stages are independent.
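The separation of the two stages can be sketched as two functions that communicate only through the intermediate representation. This is a toy under stated assumptions: the `Unit` type, the letter-per-unit analysis, and the sine-tone rendering are all hypothetical stand-ins, whereas a real system would typically produce phonemes with prosodic information and render actual speech.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class Unit:
    """One element of the (hypothetical) intermediate representation."""
    symbol: str        # a letter, or "_" for a pause between words
    duration_s: float  # how long the unit should last


def analyse_text(text: str) -> list[Unit]:
    """Stage 1: turn raw text into a sequence of timed units."""
    units: list[Unit] = []
    for word in re.findall(r"[a-z]+", text.lower()):
        units.extend(Unit(ch, 0.08) for ch in word)
        units.append(Unit("_", 0.12))  # pause after each word
    return units


def generate_waveform(units: list[Unit], sample_rate: int = 16000) -> list[float]:
    """Stage 2: render samples from the units alone, never from the text."""
    samples: list[float] = []
    for unit in units:
        n = int(round(unit.duration_s * sample_rate))
        if unit.symbol == "_":
            samples.extend([0.0] * n)  # silence
        else:
            # Placeholder acoustics: one sine tone per letter.
            freq = 200.0 + 20.0 * (ord(unit.symbol) - ord("a"))
            samples.extend(
                0.3 * math.sin(2 * math.pi * freq * i / sample_rate)
                for i in range(n)
            )
    return samples


if __name__ == "__main__":
    units = analyse_text("Hi there")
    audio = generate_waveform(units)
    print(len(units), "units,", len(audio), "samples")
```

Because `generate_waveform` sees only the list of units, either stage could be replaced (a better text analyser, a different rendering method) without changing the other, which is the sense in which the stages are independent.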

For an introduction to Text Analysis, see

Waveform generation methods subdivide into

Synthesis Toolkits

Synthesis toolkits provide software for generating synthetic speech.