Systems for speech training have concentrated almost exclusively on presenting analyses of stationary speech parameters (formants, vocal tract shape, pitch, etc.) in visual form, or on giving feedback based on vowel or utterance recognition. Consonants, however, are usually given more attention during therapy, primarily because of their greater complexity (producing them requires greater articulator agility) and their importance for intelligibility. We have developed a system based on dynamic selection from a variety of feature generation options and on detailed selection of regions over time and frequency using an error rate criterion, which allows the system to focus on the aspects of speech patterns that are most effective for phoneme pair discrimination. Tests with a set of 14 minimal consonant pairs resulted in discrimination error rates of 0.4% for this system vs. 2.7% for a k-nearest-neighbor (kNN) classifier and 5.7% for a neural network trained with backpropagation. A prototype application using a version of this algorithm that employs only the expanded feature set indicates clinically useful performance with head injury and stroke patients.
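To make the idea of selecting time-frequency regions by an error rate criterion more concrete, the following Python sketch greedily picks the cells of a small time-frequency grid that best separate two classes of tokens. It is only an illustration under stated assumptions, not the authors' algorithm: the nearest-class-mean rule, the greedy forward search, the use of resubstitution error, the synthetic data, and all names (`nearest_mean_error`, `select_cells`, `make_token`) are hypothetical choices made for this example.

```python
import random


def nearest_mean_error(a, b, cells):
    """Resubstitution error of a nearest-class-mean rule that looks only
    at the given (time, freq) cells. Assumed classifier, for illustration."""
    def vec(tok):
        return [tok[t][f] for t, f in cells]

    def mean(vs):
        return [sum(col) / len(vs) for col in zip(*vs)]

    def dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))

    va, vb = [vec(t) for t in a], [vec(t) for t in b]
    ma, mb = mean(va), mean(vb)
    # A token is assigned to class a only if it is strictly closer to a's mean.
    errors = sum(dist(v, ma) >= dist(v, mb) for v in va)
    errors += sum(dist(v, ma) < dist(v, mb) for v in vb)
    return errors / (len(va) + len(vb))


def select_cells(a, b, max_cells=3):
    """Greedy forward selection: repeatedly add the cell that lowers the
    error rate most, stopping when no cell gives an improvement."""
    n_t, n_f = len(a[0]), len(a[0][0])
    candidates = [(t, f) for t in range(n_t) for f in range(n_f)]
    chosen, best_err = [], 1.0
    while len(chosen) < max_cells:
        scored = [(nearest_mean_error(a, b, chosen + [c]), c)
                  for c in candidates if c not in chosen]
        err, cell = min(scored)
        if err >= best_err:
            break
        chosen.append(cell)
        best_err = err
    return chosen, best_err


def make_token(rng, boost):
    """Synthetic 4x5 time-frequency token; the two classes differ only
    at cell (time=1, freq=2). Purely made-up data."""
    tok = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(4)]
    tok[1][2] += boost
    return tok


rng = random.Random(0)
class_a = [make_token(rng, 0.0) for _ in range(40)]
class_b = [make_token(rng, 6.0) for _ in range(40)]
cells, err = select_cells(class_a, class_b)
print("selected cells:", cells, "error rate:", err)
```

On this synthetic data the search should settle on the one cell where the classes differ. A practical system would estimate the error rate on held-out tokens rather than on the training tokens used here.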
Bibliographic reference. Glassman, Martin S. / Starkey, Mary Beth (1989): "Minimal consonant pair discrimination for speech therapy using an expanded feature set and pattern element selection in time and frequency", In EUROSPEECH-1989, 2273-2276.