INTERSPEECH 2006 - ICSLP
Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

Minimum Boundary Error Training for Automatic Phonetic Segmentation

Jen-Wei Kuo, Hsin-Min Wang

Academia Sinica, Taiwan

Annotated speech corpora are indispensable to various areas of speech research. In this paper, we present a novel discriminative training approach for HMM-based automatic phonetic segmentation. The objective of the proposed minimum boundary error (MBE) discriminative training approach is to minimize the expected boundary errors over a set of phonetic alignments represented as a phonetic lattice. This approach is inspired by the recently proposed minimum phone error (MPE) training algorithm for automatic speech recognition. To evaluate the MBE training approach, we conducted automatic phonetic segmentation experiments on the TIMIT acoustic-phonetic continuous speech corpus. The MBE-trained HMMs can identify 79.75% of human-labeled phone boundaries within a tolerance of 10 ms, compared to 71.23% identified by the conventional ML-trained HMMs. Moreover, by using the MBE-trained HMMs, only 7.89% of automatically labeled phone boundaries have errors larger than 20 ms.

Full Paper

Bibliographic reference.  Kuo, Jen-Wei / Wang, Hsin-Min (2006): "Minimum boundary error training for automatic phonetic segmentation", In INTERSPEECH-2006, paper 1497-Tue3A1O.5.