Second International Conference on Spoken Language Processing (ICSLP'92)

Banff, Alberta, Canada
October 13-16, 1992

Vowel Classification Based on Analysis-by-Synthesis

Rolf Carlson, James Glass

Spoken Language Systems Group, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA

In this paper we report on a sequence of experiments designed to explore the use of analysis-by-synthesis methods for speech recognition and speech analysis in general. An intermediate representation of the speech signal is formulated in terms of speech-synthesis-like parameters.

Using a multi-layer perceptron as a common classifier, we have performed several vowel classification experiments based on these parameters. The results indicate that this representation matches the classification performance of a more traditional spectral representation while using nearly an order of magnitude fewer dimensions.

We have also developed a speaker normalization procedure that yields a higher classification rate than a simple male/female normalization.

In our last set of experiments, we studied the influence of context on the classification result. The best classification results were achieved by combining default formants and labels specifying the context with speaker normalization of the automatically measured synthesis parameters.


Bibliographic reference. Carlson, Rolf / Glass, James (1992): "Vowel classification based on analysis-by-synthesis", in ICSLP-1992, 575-578.