4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
In this paper, we propose an improved deformable template algorithm for modeling the shape of a talker's mouth. We use a two-step approach. The first step classifies mouth images into broad categories; the classification yields both a set of template parameters (in effect, a unique template) and a set of initial conditions. The second step allows the deformable template to converge using standard techniques. The multi-model approach is significantly more flexible than single-model approaches and consistently provides better solutions. We present examples of single- and multiple-template solutions which support this statement. In a small recognition experiment using only visual information, recognition of consonants improved from 16% to 33% when multiple templates were used.
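The two-step procedure described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `TemplateModel` structure, the nearest-centroid classifier, the category names, the numeric values, and the quadratic stand-in energy are all hypothetical, and plain gradient descent is used as one possible "standard technique" for convergence.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class TemplateModel:
    """One broad mouth category (hypothetical structure, not from the paper)."""
    centroid: Tuple[float, ...]       # feature-space centre used for classification
    weights: Tuple[float, ...]        # template parameters: per-term energy weights
    initial_shape: Tuple[float, ...]  # initial conditions for the shape variables


def classify(features: Sequence[float], models: Dict[str, TemplateModel]) -> str:
    """Step 1: pick the category whose centroid is nearest to the features."""
    def sq_dist(name: str) -> float:
        return sum((f - c) ** 2 for f, c in zip(features, models[name].centroid))
    return min(models, key=sq_dist)


def converge(shape: Sequence[float],
             energy: Callable[[List[float]], float],
             step: float = 0.1, iters: int = 200, eps: float = 1e-4) -> List[float]:
    """Step 2: gradient descent on the energy, using central-difference gradients."""
    current = list(shape)
    for _ in range(iters):
        grad = []
        for i in range(len(current)):
            up, down = current[:], current[:]
            up[i] += eps
            down[i] -= eps
            grad.append((energy(up) - energy(down)) / (2 * eps))
        current = [s - step * g for s, g in zip(current, grad)]
    return current


def fit_mouth(features, models, make_energy):
    """Classify first, then refine the category's template from its initial shape."""
    name = classify(features, models)
    model = models[name]
    return name, converge(model.initial_shape, make_energy(model.weights))


# Toy usage: two made-up categories; shape = (mouth width, mouth height).
MODELS = {
    "open": TemplateModel((1.0, 1.0), (1.0, 2.0), (3.0, 1.5)),
    "closed": TemplateModel((0.0, 0.0), (1.0, 1.0), (3.0, 0.2)),
}


def make_toy_energy(weights, target=(4.0, 2.0)):
    """Quadratic stand-in for an image-based energy (assumption for the demo)."""
    return lambda s: sum(w * (v - t) ** 2 for w, v, t in zip(weights, s, target))


category, shape = fit_mouth((0.9, 0.8), MODELS, make_toy_energy)
print(category, [round(v, 3) for v in shape])
```

The point of the sketch is the division of labour: the classifier chooses both which energy weights are used and where the search starts, so the descent step itself can stay generic. A real system would replace the toy energy with terms computed from the mouth image.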
Bibliographic reference. Chandramohan, Devi / Silsbee, Peter L. (1996): "A multiple deformable template approach for visual speech recognition", In ICSLP-1996, 50-53.