5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Continuous Speech Recognition Using Syllables

Rhys James Jones (1), Simon Downey (2), John S. Mason (1)

(1) Speech Research Group, Department of Electrical & Electronic Engineering, University of Wales, Swansea, UK
(2) Speech Technology Unit, BT Laboratories, Martlesham Heath, Ipswich, Suffolk, UK

The vast majority of work in continuous speech recognition uses phoneme-like units as the basic recognition component. The work presented here investigates the practicability of syllable-like units as the building blocks for recognition. A phonetically annotated telephony database is analysed at the syllable level, and a set of syllable-based HMMs is built. Refinements, including the introduction of syllable-level bigram probabilities, word- and syllable-level insertion penalties, and the investigation of different model topologies, are found to improve recogniser performance. The syllable-based recogniser gives recognition accuracies of over 60%, compared with a baseline accuracy of 35% for monophone recognition. It is envisaged that syllable recognition could find practical application in a hybrid system, where the most common syllable HMMs would be used in conjunction with whole-word and phoneme models.
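As an illustration of how syllable-level bigram probabilities and an insertion penalty might enter a recogniser's path score, the following is a minimal sketch only, not the authors' implementation. The function names, the add-one smoothing, the penalty value, and the toy syllable labels are all assumptions made for the example; the abstract does not specify any of them.

```python
import math
from collections import Counter


def train_bigrams(sequences):
    """Count syllable bigrams over training sequences (hypothetical helper).

    Each sequence is padded with <s> and </s> boundary markers.
    """
    unigrams, bigrams = Counter(), Counter()
    vocab = set()
    for seq in sequences:
        padded = ["<s>"] + list(seq) + ["</s>"]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            unigrams[prev] += 1          # count of prev as a bigram history
            bigrams[(prev, cur)] += 1
    return unigrams, bigrams, len(vocab)


def bigram_logprob(prev, cur, unigrams, bigrams, vocab_size):
    """Add-one smoothed log P(cur | prev); the smoothing choice is an assumption."""
    return math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))


def path_score(syllables, acoustic_logliks, model,
               lm_scale=1.0, insertion_penalty=-2.0):
    """Combine per-syllable acoustic log-likelihoods with bigram log
    probabilities and a fixed penalty added once per hypothesised syllable."""
    unigrams, bigrams, vocab_size = model
    score, prev = 0.0, "<s>"
    for syl, acoustic in zip(syllables, acoustic_logliks):
        lm = bigram_logprob(prev, syl, unigrams, bigrams, vocab_size)
        score += acoustic + lm_scale * lm + insertion_penalty
        prev = syl
    return score


# Toy data: made-up syllable labels, not taken from the paper's database.
model = train_bigrams([["kon", "tin"], ["kon", "tin"], ["tin", "kon"]])
print(path_score(["kon", "tin"], [-1.0, -1.0], model))
```

A more negative insertion penalty discourages hypotheses containing many short syllables, while the bigram term favours syllable sequences seen in training; in practice both weights would be tuned on held-out data.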

Bibliographic reference. Jones, Rhys James / Downey, Simon / Mason, John S. (1997): "Continuous speech recognition using syllables", in EUROSPEECH-1997, 1171-1174.