Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Recent Advances in Speech Fragment Decoding Techniques

Jon Barker, André Coy, Ning Ma, Martin Cooke

University of Sheffield, UK

This paper addresses the problem of recognising speech in the presence of a competing speaker. We employ a speech fragment decoding technique that treats segregation and recognition as coupled problems. Data-driven techniques are used to segment a spectro-temporal representation into a set of spectro-temporal fragments, such that each fragment is dominated by one or the other of the speech sources. A speech fragment decoder then uses missing data techniques and clean speech models to search simultaneously for the set of fragments and the word sequence that best match the target speaker model. The paper reports recent advances in this technique and presents an evaluation based on artificially mixed speech utterances. The fragment decoder produces significantly lower error rates than a conventional recogniser, and mimics the pattern of human performance whereby performance improves as the target-masker ratio is reduced below -3 dB.
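
The coupled search described in the abstract can be illustrated with a toy sketch. It is not the authors' decoder. It assumes a single diagonal-Gaussian clean speech model in place of a full word-sequence search, scores cells assigned to the target with the Gaussian density, scores cells assigned to the masker with the probability that the speech energy lies at or below the observed value (a simplified form of bounded marginalisation), and enumerates every fragment labelling exhaustively. All function and variable names are hypothetical.

```python
import math
from itertools import product


def log_gauss(x, mu, var):
    """Log density of a 1-D Gaussian; used for cells treated as target speech."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)


def log_cdf(x, mu, var):
    """Log probability that the speech value is <= x; used for masked cells.

    Assumption: the masked speech energy cannot exceed the observed energy,
    so the model is integrated from -infinity up to the observation.
    """
    p = 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0 * var)))
    return math.log(max(p, 1e-300))  # floor avoids log(0)


def hypothesis_score(obs, frag_ids, target_frags, mu, var):
    """Score one segregation hypothesis (a set of fragments labelled as target).

    obs      : list of frames, each a list of per-channel values
    frag_ids : same shape as obs, giving the fragment label of every cell
    mu, var  : per-channel mean and variance of the (toy) clean speech model
    """
    total = 0.0
    for t, frame in enumerate(obs):
        for f, y in enumerate(frame):
            if frag_ids[t][f] in target_frags:
                total += log_gauss(y, mu[f], var[f])
            else:
                total += log_cdf(y, mu[f], var[f])
    return total


def best_segregation(obs, frag_ids, mu, var):
    """Try every target/masker labelling of the fragments and keep the best.

    Exhaustive enumeration costs 2**N for N fragments, so it is only
    practical for toy inputs.
    """
    labels = sorted({i for row in frag_ids for i in row})
    best_score, best_target = -math.inf, frozenset()
    for mask in product([False, True], repeat=len(labels)):
        target = frozenset(l for l, keep in zip(labels, mask) if keep)
        score = hypothesis_score(obs, frag_ids, target, mu, var)
        if score > best_score:
            best_score, best_target = score, target
    return best_score, best_target
```

Calling `best_segregation(obs, frag_ids, mu, var)` returns the highest score together with the set of fragment labels attributed to the target. Because the sketch compares a density for target cells with a probability for masked cells, the outcome depends on feature scaling. A real system would also search over word sequences and would not enumerate labellings exhaustively.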


Bibliographic reference. Barker, Jon / Coy, André / Ma, Ning / Cooke, Martin (2006): "Recent advances in speech fragment decoding techniques", in INTERSPEECH-2006, paper 1479-Mon1WeS.4.