The Seventh ISCA Tutorial and Research Workshop on Speech Synthesis

Kyoto, Japan
September 22-24, 2010

Considering Readability in Text-to-Speech Recording Script Design

Minghui Dong, Ling Cen, Paul Chan, Haizhou Li

Human Language Technology Department, Institute for Infocomm Research, A*STAR, Singapore 138632

Designing text scripts that cover enough phonetic units and prosodic phenomena is very important when recording a speech database for corpus-based speech synthesis. In designing such recording scripts, much of the effort typically goes into achieving maximal coverage of phonetic units with a minimal amount of recorded speech. Methods that optimize for coverage alone often select sentences containing difficult words or incorrect grammar, which speakers find hard to read correctly and naturally. The selected sentences may also be unsuitable for child speakers or non-native speakers. To address these problems, we propose to consider readability in text selection. Our experiment shows that scripts selected with the proposed method have good unit coverage of the language and good readability.

Index Terms: Text-to-speech, recording scripts, text selection, text readability
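
The idea of combining unit coverage with readability can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not specify. It assumes a common greedy coverage heuristic, uses character bigrams as a stand-in for phonetic units, and uses a hypothetical readability proxy (the fraction of words found in a list of common words). The names `units`, `readability`, `select_script`, and the threshold `min_readability` are all illustrative assumptions.

```python
def units(sentence: str) -> set[str]:
    """Stand-in for phonetic units: character bigrams inside each word.
    A real system would use phones, diphones, or similar (assumption)."""
    found = set()
    for word in sentence.lower().split():
        letters = "".join(ch for ch in word if ch.isalpha())
        found.update(letters[i:i + 2] for i in range(len(letters) - 1))
    return found


def readability(sentence: str, common_words: set[str]) -> float:
    """Hypothetical readability proxy: share of words that are 'common'."""
    words = [w.strip(".,;:!?\"'").lower() for w in sentence.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(w in common_words for w in words) / len(words)


def select_script(
    candidates: list[str],
    common_words: set[str],
    min_readability: float = 0.8,
    max_sentences: int = 100,
) -> tuple[list[str], set[str]]:
    """Greedy selection: drop hard-to-read sentences first, then repeatedly
    pick the sentence that adds the most not-yet-covered units."""
    pool = [s for s in candidates
            if readability(s, common_words) >= min_readability]
    covered: set[str] = set()
    selected: list[str] = []
    while pool and len(selected) < max_sentences:
        best = max(pool, key=lambda s: len(units(s) - covered))
        gain = units(best) - covered
        if not gain:
            break  # no remaining sentence adds new units
        selected.append(best)
        covered |= gain
        pool.remove(best)
    return selected, covered
```

Filtering on readability before the greedy pass is only one possible design. Readability could instead be folded into the selection score as a weighted term, trading some coverage for easier sentences.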


Bibliographic reference. Dong, Minghui / Cen, Ling / Chan, Paul / Li, Haizhou (2010): "Considering readability in text-to-speech recording script design", in SSW7-2010, 312-316.