5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

On the Use Of Phone Duration and Segmental Processing to Label Speech Signal

Philippe Depambour (1), Regine Andre-Obrecht (1), Bernard Delyon (2)

(1) IRIT - UMR CNRS 5505, Toulouse Cedex, France (2) IRISA, Campus universitaire de Beaulieu, Rennes Cedex, France

This paper presents recent work on the labelling of continuous speech. We propose an original automatic labelling system in which elementary phone models take both a segmental analysis and phone duration into account. These models are initialized in a short speaker-independent training stage to build a model database. Starting from the standard phonetic transcription, phonological rules are gathered to handle the various pronunciations. For each new corpus or speaker, a quick unsupervised adaptation stage re-estimates the models, after which the labelling itself is performed. We assess the system by labelling a difficult corpus (sequences of connected spelled letters) and sentences from one speaker of the BREF80 corpus. The results are quite promising: in both experiments, fewer than 9% of phonetic boundaries are incorrectly located.
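As an illustration of the kind of figure quoted above, the share of incorrectly located boundaries can be computed by comparing automatic boundary times with reference ones. The sketch below is not the authors' evaluation procedure: the function name, the 20 ms tolerance, and the sample boundary times are all assumptions made for the example, and it further assumes that both labellings contain the same phone sequence, so that boundaries pair up one to one.

```python
from typing import Sequence


def boundary_error_rate(
    reference_ms: Sequence[int],
    hypothesis_ms: Sequence[int],
    tolerance_ms: int = 20,  # assumed tolerance; the abstract does not state one
) -> float:
    """Return the fraction of boundaries placed more than `tolerance_ms`
    away from the corresponding reference boundary."""
    if len(reference_ms) != len(hypothesis_ms):
        raise ValueError("both labellings must contain the same number of boundaries")
    if not reference_ms:
        raise ValueError("at least one boundary is required")

    # A boundary counts as misplaced when its deviation exceeds the tolerance.
    misplaced = sum(
        1
        for ref, hyp in zip(reference_ms, hypothesis_ms)
        if abs(ref - hyp) > tolerance_ms
    )
    return misplaced / len(reference_ms)


# Hypothetical boundary times in milliseconds (invented for illustration).
reference = [100, 250, 400, 620]
automatic = [110, 300, 410, 620]

rate = boundary_error_rate(reference, automatic)
print(f"{rate:.0%} of boundaries incorrectly located")
```

Here only the second boundary deviates by more than the assumed tolerance, so one boundary out of four is counted as misplaced. A different tolerance changes the figure, which is why any such rate should be read together with the criterion used to define a correct location.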


Bibliographic reference. Depambour, Philippe / Andre-Obrecht, Regine / Delyon, Bernard (1997): "On the use of phone duration and segmental processing to label speech signal", in EUROSPEECH-1997, 1627-1630.