4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
An audiovisual speech synthesizer for unlimited French text is presented. It uses a 3-D parametric model of the face, controlled by eight parameters. Target values have been assigned to the parameters for each French viseme, based on measurements made on a human speaker. Parameter trajectories are modeled by means of dominance functions associated with each parameter and each viseme. A dominance function is characterized by three coefficients, so that coarticulation depends on the phonetic context, the speech rate, and a "hypo-hyper articulation" coefficient adjustable by the user. Finally, the visual and audiovisual intelligibility of the first version of our visual synthesizer has been evaluated and compared to that of the acoustic synthesizer on which it was implemented.
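As an illustration of how dominance functions can shape a parameter trajectory, the sketch below blends viseme targets by a dominance-weighted average. The abstract does not give the functional form, so the negative-exponential shape and the three coefficient names (`magnitude`, `rate`, `power`) are assumptions made for this example, as are the function names and the sample values. How speech rate and the articulation coefficient map onto these coefficients is not stated in the abstract and is not modeled here.

```python
import math


def dominance(t, center, magnitude, rate, power):
    """Dominance of one viseme over one facial parameter at time t (seconds).

    Assumed form: magnitude * exp(-rate * |t - center| ** power).
    The three coefficients are hypothetical stand-ins for the three
    coefficients mentioned in the abstract.
    """
    return magnitude * math.exp(-rate * abs(t - center) ** power)


def parameter_value(t, segments):
    """Dominance-weighted average of viseme targets for one parameter.

    segments: list of (target, center, magnitude, rate, power) tuples,
    one per viseme in the utterance.
    """
    weights = [dominance(t, c, m, r, p) for (_, c, m, r, p) in segments]
    total = sum(weights)
    if total == 0.0:
        # All dominances underflowed to zero: fall back to the target of
        # the viseme whose center is nearest in time.
        return min(segments, key=lambda s: abs(t - s[1]))[0]
    return sum(w * s[0] for w, s in zip(weights, segments)) / total


# Two hypothetical visemes acting on one parameter (for example lip
# protrusion), with targets 0.0 and 1.0 centered at 0.0 s and 0.2 s.
example_segments = [
    (0.0, 0.0, 1.0, 10.0, 1.0),
    (1.0, 0.2, 1.0, 10.0, 1.0),
]
trajectory = [parameter_value(i * 0.05, example_segments) for i in range(5)]
```

With equal coefficients the trajectory passes through the mean of the two targets midway between the viseme centers. Under this assumed form, raising `rate` for a viseme narrows its influence on its neighbours, which is one way a coefficient could strengthen or weaken coarticulation.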
Bibliographic reference. Le Goff, Bertrand / Benoît, Christian (1996): "A text-to-audiovisual-speech synthesizer for French", In ICSLP-1996, 2163-2166.