Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

The Distance Set Representation of Speech Segments

Ramesh R. Sarukkai, Dana H. Bollard

Dept. of Computer Science, University of Rochester, Rochester, NY, USA

This paper evaluates the discriminative ability of nonsequential representations of speech segments. Variable-duration speech segments are converted to fixed-dimensional set representations, which are then clustered using LVQ2.1 to provide nonsequential templates. Three set representations are evaluated: 1) a quantized vector set representation, which indicates the presence or absence of each vector-quantized index in the speech segment; 2) a multi-set (counting) representation, which encodes the number of occurrences of each quantized feature within the speech segment; and 3) the distance set representation, wherein each feature vector present in a speech segment is represented as a Gaussian probability distribution based on its Euclidean distance to every codebook vector, and these distributions are then averaged to provide a template distance set representation for that speech token.
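The three representations can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy codebook, the `sigma` width parameter, and the per-frame normalization of the Gaussian weights are all assumptions made for illustration, and the LVQ2.1 clustering step is not shown.

```python
import math

def sq_dist(x, c):
    """Squared Euclidean distance between a feature vector and a codebook vector."""
    return sum((a - b) ** 2 for a, b in zip(x, c))

def nearest_index(x, codebook):
    """Vector quantization: index of the closest codebook vector."""
    return min(range(len(codebook)), key=lambda k: sq_dist(x, codebook[k]))

def presence_set(frames, codebook):
    """Representation 1: 1 if a codebook index occurs in the segment, else 0."""
    rep = [0] * len(codebook)
    for x in frames:
        rep[nearest_index(x, codebook)] = 1
    return rep

def count_set(frames, codebook):
    """Representation 2: number of occurrences of each codebook index."""
    rep = [0] * len(codebook)
    for x in frames:
        rep[nearest_index(x, codebook)] += 1
    return rep

def distance_set(frames, codebook, sigma=1.0):
    """Representation 3: Gaussian weights over all codebook vectors, averaged over frames.

    Assumption: each frame's weights are normalized to sum to 1, and sigma
    is a free width parameter; the abstract does not specify either detail.
    """
    rep = [0.0] * len(codebook)
    for x in frames:
        w = [math.exp(-sq_dist(x, c) / (2.0 * sigma ** 2)) for c in codebook]
        total = sum(w) or 1.0  # guard against numerical underflow to zero
        for k, wk in enumerate(w):
            rep[k] += wk / total
    return [r / len(frames) for r in rep]

# Toy data (hypothetical): a 3-entry codebook and a 4-frame segment in 2-D.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
frames = [[0.1, 0.0], [0.9, 0.1], [1.0, 0.0], [0.2, 0.1]]

print(presence_set(frames, codebook))
print(count_set(frames, codebook))
print(distance_set(frames, codebook))
```

Each function returns a vector whose length equals the codebook size, regardless of how many frames the segment contains, which is what makes variable-duration segments comparable as fixed-dimensional templates.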

Bibliographic reference. Sarukkai, Ramesh R. / Bollard, Dana H. (1995): "The distance set representation of speech segments", in EUROSPEECH-1995, 1427-1430.