This paper describes an integrated software system for the conversion of speech into graphic animation suitable for lipreading. The objective of the work has been to integrate the unimodal audio information conveyed through acoustic speech with coherent visual information associated with the movements of the speaker's mouth. Thanks to this integration, perceptive impairments affecting the auditory channel, such as those due to environmental noise, age, or handicap, can be bypassed effectively through the visual modality. The described system implements a direct mapping from audio to visual parameters through two processing stages responsible for speech analysis and visual synthesis, respectively. The specific architecture employed for the conversion, based on Time-Delay Neural Networks, was chosen in consideration of the bimodal nature of speech and, in particular, of the complex coarticulation phenomena, which require a given amount of past acoustic information to be integrated before the visual conversion is performed. Preliminary experimental results are reported, showing the concrete possibility of providing new services in telecommunications and rehabilitation.
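The core idea of a time-delay mapping can be sketched as follows: each output (visual) frame is computed from a sliding window of past acoustic frames, so coarticulation context is integrated before conversion. This is a minimal illustrative sketch with hypothetical dimensions and random weights, not the authors' trained model.

```python
import numpy as np

def tdnn_layer(acoustic, weights, bias):
    """One time-delay layer mapping acoustic to visual parameters.

    acoustic: (T, F) array of T acoustic feature frames of dimension F.
    weights:  (D, F, V) shared weights over a window of D past frames.
    bias:     (V,) bias for the V visual parameters.
    Returns an array of shape (T - D + 1, V): one visual frame per
    position of the sliding window, each integrating D frames of context.
    """
    T, F = acoustic.shape
    D, _, V = weights.shape
    out = np.empty((T - D + 1, V))
    for t in range(T - D + 1):
        window = acoustic[t:t + D]  # D consecutive acoustic frames
        # Same weights applied at every time step (time-delay property)
        out[t] = np.tanh(np.einsum('df,dfv->v', window, weights) + bias)
    return out

# Hypothetical dimensions: 20 frames, 12 acoustic features,
# a 5-frame delay window, and 3 visual (mouth-shape) parameters.
rng = np.random.default_rng(0)
T, F, D, V = 20, 12, 5, 3
frames = rng.standard_normal((T, F))
w = rng.standard_normal((D, F, V)) * 0.1
b = np.zeros(V)

visual = tdnn_layer(frames, w, b)
print(visual.shape)  # (16, 3): one visual frame per window position
```

In a full system, several such layers would be stacked and the weights learned from paired audio-visual recordings; the sketch only shows how the delay window pools past acoustic information into each visual frame.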
Bibliographic reference. Lavagetto, Fabio / Lavagetto, Paolo (1995): "A new algorithm for visual synthesis of speech", In EUROSPEECH-1995, 303-306.