4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
This paper presents statistical analyses of context-dependent phone durations using the hand-segmented TIMIT database, for the purpose of improving automatic speech recognition. Two main approaches were used. (1) Duration distributions were obtained for individual contextual factors, such as broader classes specified by long or short vowels, word stress, syllable position within the word and within the utterance, postvocalic consonants, and utterance speaking rate. (2) A hierarchically structured analysis of variance was used to quantify the contributions of 11 different contextual factors to the variation in duration.
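As a rough illustration of the kind of computation involved, the sketch below derives phone durations from TIMIT-style segment labels (lines of the form "start_sample end_sample phone", with audio sampled at 16 kHz) and estimates how much of the duration variance a single factor accounts for. It is a simplified one-factor decomposition, not the hierarchical 11-factor analysis of the paper; the function names parse_phn, group_durations and eta_squared are hypothetical, and using the phone label itself as the grouping factor is an assumption made only for this example.

```python
from collections import defaultdict
from statistics import mean

SAMPLE_RATE_HZ = 16000  # TIMIT audio is sampled at 16 kHz


def parse_phn(text, sample_rate=SAMPLE_RATE_HZ):
    """Parse TIMIT-style label lines ("start end phone") into
    (phone, duration_in_ms) pairs. Hypothetical helper for illustration."""
    records = []
    for line in text.strip().splitlines():
        start, end, phone = line.split()
        duration_ms = (int(end) - int(start)) * 1000.0 / sample_rate
        records.append((phone, duration_ms))
    return records


def group_durations(records):
    """Collect durations per factor level. Here the level is the first
    element of each record (the phone label, by assumption)."""
    groups = defaultdict(list)
    for level, duration_ms in records:
        groups[level].append(duration_ms)
    return dict(groups)


def eta_squared(groups):
    """Fraction of total duration variance explained by one factor:
    between-group sum of squares divided by total sum of squares.
    This is a one-way simplification, not a hierarchical ANOVA."""
    all_values = [d for durations in groups.values() for d in durations]
    grand_mean = mean(all_values)
    ss_total = sum((d - grand_mean) ** 2 for d in all_values)
    ss_between = sum(
        len(durations) * (mean(durations) - grand_mean) ** 2
        for durations in groups.values()
    )
    # With no variation at all, report zero explained variance.
    return ss_between / ss_total if ss_total else 0.0
```

Applied to records grouped by any other contextual factor (stress, syllable position, speaking-rate class), the same eta_squared function would give a comparable single-factor figure, provided the grouping key is replaced accordingly.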
Bibliographic reference. Wang, Xue / Pols, Louis C. W. / Bosch, Louis F. M. ten (1996): "Analysis of context-dependent segmental duration for automatic speech recognition", In ICSLP-1996, 1181-1184.