EUROSPEECH '97
5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997


Characteristics of Slow, Average and Fast Speech and Their Effects in Large Vocabulary Continuous Speech Recognition

Fernando Martinez, Daniel Tapias, Jorge Alvarez, Paloma Leon

Speech Technology Group, Telefonica Investigacion y Desarrollo, S.A., Madrid, Spain

In this paper we report the characteristics of slow, average and fast speech. The study uses the TRESVEL Spanish database, which comprises 3200 sentences uttered at three different speech rates and contains speech material from 20 male and 20 female speakers. The database was designed to study, evaluate and compensate for the effect of speech rate in Large Vocabulary Continuous Speech Recognition (LVCSR) systems. We present a new measure of the rate of speech (ROS), normalised using a set of constants that depend on the expected duration of each phone. Finally, we report the degradation in performance of a continuous speech recognition system when the speech rate is low or high, and the evaluation of two compensation techniques. Adapting the language weight, insertion penalties and HMM state-transition probabilities for slow speech provides a 21.5% reduction of the word error rate (WER).
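The abstract does not give the formula for the normalised ROS, so the following is only a minimal sketch of one plausible reading: the sum of the expected durations of the phones in an utterance, divided by their observed total duration. The function name `normalised_ros`, the table `EXPECTED_DURATION` and all numeric values are hypothetical and are not taken from the paper or from the TRESVEL database.

```python
from typing import Mapping, Sequence, Tuple

# Hypothetical expected phone durations in seconds.
# Illustrative values only; the paper's constants are not given in the abstract.
EXPECTED_DURATION = {"k": 0.06, "a": 0.08, "s": 0.10}


def normalised_ros(
    segments: Sequence[Tuple[str, float]],
    expected: Mapping[str, float] = EXPECTED_DURATION,
) -> float:
    """Return an assumed duration-normalised rate of speech.

    segments: (phone label, observed duration in seconds) pairs.
    The result is about 1.0 when phones last as long as expected,
    above 1.0 for faster speech and below 1.0 for slower speech.
    """
    if not segments:
        raise ValueError("at least one phone segment is required")
    expected_total = sum(expected[phone] for phone, _ in segments)
    observed_total = sum(duration for _, duration in segments)
    if observed_total <= 0:
        raise ValueError("observed durations must sum to a positive value")
    return expected_total / observed_total


# Usage: the same phone sequence at three speaking rates.
average = [("k", 0.06), ("a", 0.08), ("s", 0.10)]
fast = [(p, d / 2) for p, d in average]
slow = [(p, d * 2) for p, d in average]
rates = {name: normalised_ros(seg) for name, seg in
         [("slow", slow), ("average", average), ("fast", fast)]}
```

Under this assumption, normalising by expected phone durations keeps an utterance made of intrinsically long phones from being scored as slow merely because of its phonetic content.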


Bibliographic reference.  Martinez, Fernando / Tapias, Daniel / Alvarez, Jorge / Leon, Paloma (1997): "Characteristics of slow, average and fast speech and their effects in large vocabulary continuous speech recognition", In EUROSPEECH-1997, 469-472.