This paper studies different sets of subword speech units for recognizing Spanish. In particular, it compares context-dependent phones, syllables and demisyllables. It shows that context-dependent units reduce the error by 15% with respect to context-independent phones. The benefit of merging similar contexts when there is not enough training data is also validated. The paper also studies the behavior of syllable-based units: syllables give a performance similar to that of triphones, whereas demisyllables give a performance similar to that of right (or left) context-dependent phones. However, when different types of units are used, context-dependent phones give the best results. Results achieved with these sets of units exceed 70% in acoustic-phonetic decoding of Spanish speech.
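As an illustration of the context-dependent phone units mentioned in the abstract, the following minimal sketch derives triphone and right-context labels from a phone sequence. The function name `context_units`, the `l-p+r` label notation, the `sil` boundary symbol and the example phone sequence are assumptions made for this illustration and are not taken from the paper.

```python
from typing import List


def context_units(phones: List[str], left: bool = True, right: bool = True,
                  boundary: str = "sil") -> List[str]:
    """Label each phone with its neighbours, written as 'l-p+r'.

    Hypothetical helper: the notation and the boundary symbol are
    illustrative choices, not the paper's.
    """
    # Pad the sequence so the first and last phones also have a context.
    padded = [boundary] + phones + [boundary]
    units = []
    for i in range(1, len(padded) - 1):
        label = padded[i]
        if left:
            label = f"{padded[i - 1]}-{label}"
        if right:
            label = f"{label}+{padded[i + 1]}"
        units.append(label)
    return units


# Illustrative phone sequence (assumed transcription of one short word).
phones = ["k", "a", "s", "a"]
triphones = context_units(phones)                 # left and right context
right_ctx = context_units(phones, left=False)     # right context only
```

In this sketch the two occurrences of `a` become two different triphones because their neighbours differ, so the unit inventory grows with the amount of context used. This is why merging similar contexts can help when training data is scarce.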
Bibliographic reference. Bonafonte, Antonio / Estany, Rafael / Vives, Eugenio (1995): "Study of subword units for Spanish speech recognition", In EUROSPEECH-1995, 1607-1610.