First European Conference on Speech Communication and Technology

Paris, France
September 27-29, 1989

Phoneme Segmentation of Speech, Based on Temporal Decomposition Using Band Filter Spectra and Phonetic Rules

E. J. M. van Mierlo, E. Blaauw, Gerrit Bloothooft

Research Institute for Language and Speech, University of Utrecht, Utrecht, The Netherlands

Temporal Decomposition (TD) is a technique that can be used to segment speech. The technique is most often applied to log-area coefficients (or ratios), because these input parameters support an articulatory interpretation of the results, but this choice is not a necessary condition. In this study, the output of a filter bank is used as input for TD. Our aim was not an articulatory interpretation, but simply a form of data reduction that may help in dividing the speech signal into phonetically relevant segments. It is known that TD generally produces more segments than would be expected on the basis of phonetic expert judgements, while it finds too few segments in only a few cases. TD on the basis of band filter energies shows the same behaviour; in addition, the spectra assigned to each segment are easier to interpret because intensity information is preserved. This is not the case when log-area coefficients (or ratios) are used. The intensity information is probably useful for improved phoneme segmentation and classification.
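
The contrast between the two parameter sets regarding intensity can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `band_energies_db` and `log_area_ratios`, the FFT-based equal-width bands standing in for a real filter bank, the LPC order of 8, and the frame length are all assumptions made for illustration, and sign conventions for log-area ratios differ between texts. The sketch only shows that attenuating a frame shifts its band energies while leaving its log-area ratios unchanged, since the latter are derived from gain-normalised reflection coefficients.

```python
import numpy as np

def band_energies_db(frame, n_bands=8):
    """Log energy (dB) in n_bands equal-width bands of the power spectrum.

    A crude stand-in for a band filter bank (assumption, not the paper's filters).
    """
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    bands = np.array_split(spectrum, n_bands)
    return 10.0 * np.log10(np.array([b.sum() for b in bands]) + 1e-12)

def log_area_ratios(frame, order=8):
    """Log-area ratios via Levinson-Durbin recursion on the autocorrelation."""
    x = frame * np.hanning(len(frame))
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    refl = np.zeros(order)
    for i in range(1, order + 1):
        # Prediction error correlation for the current order.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err          # reflection coefficient; any gain factor cancels here
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
        refl[i - 1] = k
    return np.log((1.0 - refl) / (1.0 + refl))

# Synthetic frame (white noise) and the same frame attenuated by 20 dB.
rng = np.random.default_rng(0)
frame = rng.standard_normal(256)
quiet = 0.1 * frame

print("band energy shift (dB):", band_energies_db(quiet) - band_energies_db(frame))
print("max log-area ratio change:",
      np.max(np.abs(log_area_ratios(quiet) - log_area_ratios(frame))))
```

In this sketch the overall level of a frame is visible in every band energy but absent from the log-area ratios, which is one way to read the statement that band filter spectra preserve intensity information.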

