EUROSPEECH 2001 Scandinavia
An efficient method is proposed for reducing the amount of feature data in real-time automatic image-transform-based lipreading. The image-transform-based approach, which obtains a compressed representation of the pixel values of the speaker's mouth image, is reported to show superior lipreading performance. However, because this approach produces many feature vectors relevant to lip information, it requires considerable computation time even when principal component analysis (PCA) is applied. To reduce the computational load, we propose an algorithm that exploits the symmetry of the lips. The proposed method reduces the number of required feature vectors by up to 51% compared to the original method. It also improves the recognition rate by compensating for variations in illumination. On our database (22 words, 70 talkers), recorded in a natural environment, the method achieved an accuracy of 53.5% on a visual-only, speaker-independent word recognition task. The extracted features are modeled by hidden Markov models with Gaussian mixture distributions.
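The abstract does not spell out how the lip symmetry is used, so the following is only a minimal sketch of one plausible reading: fold each mouth image about its vertical midline, average the left half with the mirrored right half, and apply PCA to the halved pixel vectors. The function names `fold_about_midline` and `pca_project`, the frame size, and the synthetic random frames are all hypothetical illustrations and are not taken from the paper.

```python
import numpy as np

def fold_about_midline(frame):
    """Average the left half of a mouth image with the mirrored right half.

    Assumption: the mouth is centered, so the vertical midline of the
    image is the symmetry axis. For odd widths the middle column is dropped.
    """
    h, w = frame.shape
    half = w // 2
    left = frame[:, :half].astype(np.float64)
    # Reverse the right half so each column lines up with its reflection.
    right = frame[:, w - half:][:, ::-1].astype(np.float64)
    return 0.5 * (left + right)

def pca_project(vectors, n_components):
    """Project row vectors onto their top principal components via SVD."""
    centered = vectors - vectors.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Synthetic stand-in data: 20 frames of 16x32 pixels (not real lip images).
rng = np.random.default_rng(0)
frames = rng.random((20, 16, 32))

# Folding halves the number of pixels per frame before PCA.
folded = np.stack([fold_about_midline(f).ravel() for f in frames])
features = pca_project(folded, n_components=5)
print(folded.shape, features.shape)
```

In this toy setting, folding halves the pixel vector fed to PCA, and averaging a frame with its mirror image cancels a linear left-to-right brightness gradient. Whether the paper's illumination compensation works this way is not stated in the abstract.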
Bibliographic reference. Lee, Joohun / Kim, JinYoung (2001): "An efficient lipreading method using the symmetry of lip", In EUROSPEECH-2001, 1019-1022.