5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

The Demiphone: An Efficient Subword Unit for Continuous Speech Recognition

José B. Marino, Albin Nogueiras, Antonio Bonafonte

Universitat Politecnica de Catalunya, Barcelona, Spain

In this paper we introduce the demiphone as a context-dependent phonetic unit for continuous speech recognition. A phone is divided into two parts: a left demiphone that accounts for left-side coarticulation and a right demiphone that accounts for the right-side context. This new unit ignores the dependence between the effects of the two contexts, but allows better training of the transitions between phones. The demiphone can be seen as a heuristic clustering of states that allows smoother training of hidden Markov models and additionally provides a simple way to build unseen triphones. We report experimental evidence that demiphones outperform the usual combination of triphones, right-side and left-side biphones, and monophones.
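
As an illustration of the decomposition described in the abstract, the following minimal Python sketch maps a phone sequence to a demiphone sequence and shows how a triphone that never occurred in training could still be assembled from a left and a right demiphone. The labels `l-p` and `p+r`, the `sil` boundary symbol, and both function names are assumptions made for this sketch (loosely following the common `l-p+r` triphone notation); they are not taken from the paper.

```python
from typing import List, Tuple

BOUNDARY = "sil"  # hypothetical symbol for the utterance boundary (assumption)


def to_demiphones(phones: List[str], boundary: str = BOUNDARY) -> List[str]:
    """Split every phone into a left and a right demiphone.

    The left demiphone "l-p" depends only on the preceding phone l;
    the right demiphone "p+r" depends only on the following phone r.
    """
    units: List[str] = []
    for i, phone in enumerate(phones):
        left_ctx = phones[i - 1] if i > 0 else boundary
        right_ctx = phones[i + 1] if i + 1 < len(phones) else boundary
        units.append(f"{left_ctx}-{phone}")   # left demiphone
        units.append(f"{phone}+{right_ctx}")  # right demiphone
    return units


def triphone_as_demiphones(left_ctx: str, phone: str, right_ctx: str) -> Tuple[str, str]:
    """Represent the triphone l-p+r as the pair (l-p, p+r).

    Because each half depends on one context only, the pair can be built
    even if this exact triphone was never observed, provided each half was.
    """
    return (f"{left_ctx}-{phone}", f"{phone}+{right_ctx}")


if __name__ == "__main__":
    print(to_demiphones(["k", "a", "t"]))
    print(triphone_as_demiphones("k", "a", "t"))
```

For a set of N phones, this scheme needs at most 2N² context-dependent units (N² left and N² right demiphones), compared with up to N³ triphones, which is one way to read the abstract's claim about smoother training.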

Bibliographic reference. Marino, José B. / Nogueiras, Albin / Bonafonte, Antonio (1997): "The demiphone: an efficient subword unit for continuous speech recognition", In EUROSPEECH-1997, 1215-1218.