INTERSPEECH 2013
14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Detecting Words in Speech Using Linear Separability in a Bag-of-Events Vector Space

Maarten Versteegh (1), Louis ten Bosch (2)

(1) IMPRS for Language Sciences, The Netherlands
(2) Radboud Universiteit Nijmegen, The Netherlands

This paper studies the properties of the Histograms of Acoustic Co-occurrences (HAC) approach to acoustic modeling. While HAC vectors have predominantly been used with matrix decomposition algorithms, we show that the additivity and sparseness constraints inherent in HAC lead to a representational space in which utterances are linearly separable with respect to the words they contain. This implies that words can be detected and located in test utterances by a classifier that is trained without any information about word order or word location. We demonstrate this by showing that an ensemble of linear classifiers can reach excellent detection scores on the TIDIGITS dataset. We further explore the usefulness of linear separability in the HAC space by demonstrating a sliding-window decoder for continuous speech recognition.
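The idea of detecting words with linear classifiers over co-occurrence histograms can be illustrated with a minimal toy sketch. It is not the authors' pipeline: the three-word lexicon, the six-symbol event codebook, the lag of one event, and the use of a plain perceptron as the linear classifier are all assumptions made for illustration. Each utterance is reduced to a histogram of ordered event pairs, and one detector per word is trained from utterance-level labels only (which words occur), with no word order or position information.

```python
from itertools import product

N_CODES = 6  # size of a hypothetical acoustic-event codebook
# Hypothetical lexicon: each word is a fixed sequence of event codes.
LEXICON = {"one": [0, 1], "two": [2, 3], "three": [4, 5]}


def events_of(words):
    """Concatenate the event sequences of the words in an utterance."""
    return [e for w in words for e in LEXICON[w]]


def hac(events, lag=1):
    """Histogram of ordered event pairs (events[t], events[t + lag])."""
    vec = [0] * (N_CODES * N_CODES)
    for a, b in zip(events, events[lag:]):
        vec[a * N_CODES + b] += 1
    return vec


def train_perceptron(X, y, max_epochs=200):
    """Plain perceptron (an assumed stand-in for the linear classifier)."""
    w, b = [0] * len(X[0]), 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, t in zip(X, y):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if t * score <= 0:  # misclassified or on the boundary
                w = [wi + t * xi for wi, xi in zip(w, x)]
                b += t
                mistakes += 1
        if mistakes == 0:  # training set is separated
            break
    return w, b


def detect(model, words):
    """Return True if the detector fires on the utterance's HAC vector."""
    w, b = model
    x = hac(events_of(words))
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0


vocab = list(LEXICON)
# Training utterances: every one- and two-word sequence over the lexicon.
train = [list(p) for n in (1, 2) for p in product(vocab, repeat=n)]
X = [hac(events_of(u)) for u in train]
# One detector per word; labels say only whether the word occurs.
models = {
    word: train_perceptron(X, [1 if word in u else -1 for u in train])
    for word in vocab
}

print(detect(models["one"], ["two", "three", "one"]))  # True
print(detect(models["one"], ["two", "three", "two"]))  # False
```

In this toy setting the histogram of a concatenated utterance equals the sum of the histograms of its parts plus one boundary pair, which is the additivity property the abstract relies on. The detectors separate the training utterances perfectly, but generalization to every longer utterance is not guaranteed with so little training data.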


Bibliographic reference.  Versteegh, Maarten / Bosch, Louis ten (2013): "Detecting words in speech using linear separability in a bag-of-events vector space", In INTERSPEECH-2013, 680-684.