EUROSPEECH 2001 Scandinavia
Over the last few years we have been experimenting with an automatic phonetic segmentation and labeling system based on a modified HMM phonetic recognizer followed by a local phonetic boundary refinement system. During this period we have used different approaches for the local refinement, including fuzzy rules and neural networks. In this paper we present a unified framework for the local refinement of phonetic boundaries that has allowed us to thoroughly evaluate and compare these approaches, together with a further one based on Gaussian mixture models. Results show that neural networks outperform the other approaches in speaker-dependent mode, achieving a precision almost equal to that of manual segmentation. In speaker-independent mode, however, neural networks and fuzzy rules achieve almost the same performance, slightly worse than manual segmentation.
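The two-stage idea in the abstract (an initial boundary from the recognizer, then a local search around it) can be illustrated with a minimal sketch. This is not the authors' implementation. The function `refine_boundary`, the `radius` parameter, and the toy scoring function `spectral_change` are all hypothetical names introduced here; in a real system the score would presumably come from a trained transition model such as fuzzy rules, a neural network, or a Gaussian mixture model.

```python
from typing import Callable, Sequence


def refine_boundary(
    initial: int,
    n_frames: int,
    score: Callable[[int], float],
    radius: int = 3,
) -> int:
    """Return the frame index near `initial` with the highest transition score.

    Assumption: `score(t)` is higher when frame t looks more like a
    phonetic transition. Ties go to the candidate closest to `initial`.
    """
    lo = max(0, initial - radius)
    hi = min(n_frames - 1, initial + radius)
    return max(
        range(lo, hi + 1),
        key=lambda t: (score(t), -abs(t - initial)),
    )


# Toy stand-in for a transition model: absolute change between
# consecutive values of a one-dimensional feature track.
features: Sequence[float] = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0]


def spectral_change(t: int) -> float:
    return abs(features[t] - features[t - 1]) if t > 0 else 0.0


# The initial boundary at frame 2 is moved to the frame where the
# feature track actually changes.
refined = refine_boundary(2, len(features), spectral_change)
```

Restricting the search to a small window around the initial boundary keeps the refinement local: the recognizer's segmentation fixes which boundary is being adjusted, and the scoring function only decides where exactly it falls.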
Bibliographic reference. Toledano, Doroteo Torre / Gómez, Luis A. Hernández (2001): "Local refinement of phonetic boundaries: a general framework and its application using different transition models", In EUROSPEECH-2001, 1695-1698.