5th International Conference on Spoken Language Processing
At ICSLP'96 we presented a flexible, large-vocabulary, speaker-independent, isolated-word preselection system for a telephone environment, using a two-stage, bottom-up strategy. We achieved reasonable performance on large and very large vocabulary tasks, ranging from 1200 to 10000 words. In this paper, we describe recent work on the system in two directions: handling non-speech sounds in the speech signal (we consider lip noises, breathing, and clicks), and making the preselection lists dynamic in length to reduce the average computational load. In the first case, we want to model non-speech sounds, because these effects are crucial in real-life situations, where they lead to incorrect endpointing and higher error rates. In the second, we are interested in integrating any available system parameter into the calculation of the preselection list length, and we have applied both parametric and non-parametric methods.
Bibliographic reference. Ferreiros, Javier / Macias-Guarasa, Javier / Gallardo, Ascension / Colas, José / Cordoba, Ricardo / Pardo, José Manuel / Villarrubia, Luis (1998): "Recent work on a preselection module for a flexible large vocabulary speech recognition system in telephone environment", In ICSLP-1998, paper 0987.