This paper explores a framework, based on the concept of the articulatory stroke, for incorporating articulatory movement information into a classical ASR scheme. An articulatory stroke is a geometrical segmental unit that corresponds to an articulatory gesture approaching and then releasing a target. Using the stroke parameters, critical and non-critical (i.e., secondary or dummy) articulatory gestures can be classified with about 88% accuracy. Phonetic recognition accuracy is also investigated by augmenting conventional MFCC features with articulatory stroke features obtained from the MOCHA corpus. Phonetic recognition accuracy increases by 15% with respect to the best result obtained with the MFCC parameters alone. This supports the usefulness of the articulatory stroke representation of articulatory movements, not only for describing speech production but also for automatic speech recognition.
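As a rough illustration of what augmenting MFCC features with articulatory stroke features could look like, the sketch below joins two frame-aligned feature streams into one vector per frame. It is a minimal sketch under stated assumptions, not the method used in the paper: the abstract does not say how the two streams are combined, so frame-level concatenation is assumed here, and the function name `augment_features`, the feature dimensions, and the numeric values are all hypothetical placeholders.

```python
from typing import List, Sequence


def augment_features(
    mfcc: Sequence[Sequence[float]],
    stroke: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate two frame-aligned feature streams frame by frame.

    Assumption: both streams are sampled at the same frame rate, so
    frame i of `mfcc` and frame i of `stroke` describe the same instant.
    """
    if len(mfcc) != len(stroke):
        raise ValueError("feature streams must have the same number of frames")
    # Each output frame is the acoustic vector followed by the stroke vector.
    return [list(a) + list(s) for a, s in zip(mfcc, stroke)]


# Hypothetical toy input: 3 frames, 2 acoustic dimensions, 1 stroke dimension.
mfcc_frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
stroke_frames = [[0.1], [0.2], [0.3]]

augmented = augment_features(mfcc_frames, stroke_frames)
print(len(augmented), len(augmented[0]))
```

The augmented vectors would then be passed to the recognizer in place of the MFCC-only vectors. The length check guards against silently pairing misaligned frames.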
Bibliographic reference. Molina, Carlos / Lee, Sungbok / Narayanan, Shrikanth / Yoma, Néstor Becerra (2011): "A study of the effectiveness of articulatory strokes for phonemic recognition", In INTERSPEECH-2011, 2513-2516.