This paper demonstrates a method for evaluating the output of automatic speech segmentation systems. The methodology is illustrated by experiments on the output of a hidden Markov model (HMM) based automatic segmentation system developed at the Centre for Speech Technology Research. The experiments compared the phoneme boundaries allocated by hand segmentation with those allocated by automatic segmentation of the same utterances. The results of this large-scale evaluation, covering almost 6000 phonemes, support a qualitative analysis of each HMM and allow poor automatic segmentations to be identified with respect to particular contexts. This type of analysis provides a platform for iterative optimization and fine-tuning of the models: HMM training for subsequent automatic segmentations can be tailored to contain more examples of any segment (perhaps in a specific context) for which the performance of the system was unsatisfactory. Furthermore, the use of standardized evaluation methods allows objective cross-comparison between different automatic segmentation systems.
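The boundary comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the input layout (a list of `(label, end_time_ms)` pairs per utterance) and the 20 ms tolerance are all assumptions made for this example. It groups the signed difference between automatic and hand-placed boundaries by phoneme label, so that labels with poor agreement stand out.

```python
from collections import defaultdict
from statistics import mean


def boundary_deviations(manual, automatic):
    """Group signed boundary deviations (automatic minus manual, in ms) by label.

    Both arguments are lists of (label, end_time_ms) pairs for the same
    utterance; this layout is an assumption made for the illustration.
    """
    if len(manual) != len(automatic):
        raise ValueError("segmentations must contain the same number of segments")
    per_label = defaultdict(list)
    for (label_m, end_m), (label_a, end_a) in zip(manual, automatic):
        if label_m != label_a:
            raise ValueError(f"label mismatch: {label_m!r} vs {label_a!r}")
        per_label[label_m].append(end_a - end_m)
    return dict(per_label)


def summarize(per_label, tolerance_ms=20.0):
    """Per-label count, mean absolute deviation, and share within tolerance.

    The default tolerance is a hypothetical value, not taken from the paper.
    """
    summary = {}
    for label, devs in per_label.items():
        abs_devs = [abs(d) for d in devs]
        summary[label] = {
            "count": len(devs),
            "mean_abs_ms": mean(abs_devs),
            "within_tolerance": sum(d <= tolerance_ms for d in abs_devs) / len(devs),
        }
    return summary


if __name__ == "__main__":
    # Toy data, invented for the example.
    manual = [("s", 100.0), ("a", 220.0), ("s", 300.0)]
    automatic = [("s", 110.0), ("a", 215.0), ("s", 340.0)]
    for label, stats in summarize(boundary_deviations(manual, automatic)).items():
        print(label, stats)
```

Labels with a large mean absolute deviation or a low share of boundaries within tolerance would be candidates for additional training examples. Extending the grouping key from the label alone to a (left neighbour, label, right neighbour) triple would give the context-specific view mentioned in the abstract.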
Bibliographic reference. Schmidt, M. S. / Watson, G. S. (1991): "The evaluation and optimization of automatic speech segmentation", In EUROSPEECH-1991, 701-704.