5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Resegmentation of SWITCHBOARD

Neeraj Deshmukh, Aravind Ganapathiraju, Andi Gleeson, Jonathan Hamaker, Joseph Picone

Institute for Signal and Information Processing, Mississippi State University, USA

The SWITCHBOARD (SWB) corpus is one of the most important benchmarks for large vocabulary conversational speech recognition (LVCSR). The high error rates on SWB are largely attributable to an acoustic model mismatch, the high frequency of poorly articulated monosyllabic words, and large variations in pronunciation. Improving the quality of the segmentations and transcriptions of the training data is therefore imperative for better acoustic modeling. By adapting existing acoustic models to only a small subset of such improved transcriptions, we have achieved a 2% absolute improvement in performance.


Bibliographic reference. Deshmukh, Neeraj / Ganapathiraju, Aravind / Gleeson, Andi / Hamaker, Jonathan / Picone, Joseph (1998): "Resegmentation of SWITCHBOARD", in ICSLP-1998, paper 0685.