4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
In this paper we investigate an automatic method for segmenting labeled speech. The method requires an initial estimate of the segmentation, which is provided by an HMM-based alignment. The boundaries are then refined by moving each frontier frame to the adjacent segment that it most resembles. Gaussian pdfs are used as the similarity measure. The performance of the method is evaluated on the TIMIT database. If boundary deviations (from the reference position) larger than 20 ms are counted as errors, then the refinement of the boundaries reduces the error by 30%. Additional experiments show that the proposed method makes performance largely independent of whether speaker-dependent or speaker-independent data are used to estimate the HMMs.
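The refinement step described above can be illustrated with a minimal sketch. The abstract does not give the implementation details, so everything specific here is an assumption made for illustration: one diagonal-covariance Gaussian per segment, boundaries shifted one frame at a time, models re-estimated after each pass, and the hypothetical names `diag_gauss_logpdf`, `refine_boundaries`, `var_floor`, and `min_len`. It is not the authors' code.

```python
import numpy as np

def diag_gauss_logpdf(x, mean, var):
    """Log-density of frame x under a diagonal-covariance Gaussian."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def refine_boundaries(frames, boundaries, max_iter=20, var_floor=1e-3, min_len=2):
    """Shift interior boundaries frame by frame toward the better-matching segment.

    frames:     array of shape (T, D), one feature vector per frame.
    boundaries: increasing frame indices [0, b1, ..., T]; segment k covers
                frames[boundaries[k]:boundaries[k + 1]]. The initial values
                would come from an HMM alignment (not shown here).
    """
    b = list(boundaries)
    for _ in range(max_iter):
        # Assumption: one Gaussian per segment, fitted to its current frames.
        models = []
        for k in range(len(b) - 1):
            seg = frames[b[k]:b[k + 1]]
            models.append((seg.mean(axis=0),
                           np.maximum(seg.var(axis=0), var_floor)))
        changed = False
        for k in range(1, len(b) - 1):
            left, right = models[k - 1], models[k]
            last_left = frames[b[k] - 1]   # frontier frame of the left segment
            first_right = frames[b[k]]     # frontier frame of the right segment
            if (b[k] - b[k - 1] > min_len and
                    diag_gauss_logpdf(last_left, *right) >
                    diag_gauss_logpdf(last_left, *left)):
                b[k] -= 1                  # frame fits the right segment better
                changed = True
            elif (b[k + 1] - b[k] > min_len and
                    diag_gauss_logpdf(first_right, *left) >
                    diag_gauss_logpdf(first_right, *right)):
                b[k] += 1                  # frame fits the left segment better
                changed = True
        if not changed:
            break
    return b
```

A call such as `refine_boundaries(features, [0, 7, 20])` takes a `(T, D)` feature array and a rough initial segmentation and returns the adjusted boundary indices. The `min_len` guard keeps every segment non-empty so that a Gaussian can still be estimated for it.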
Bibliographic reference. Bonafonte, Antonio / Nogueiras, Albino / Rodriguez-Garrido, Antonio (1996): "Explicit segmentation of speech using Gaussian models", In ICSLP-1996, 1269-1272.