Subword and Crossword Units for CTC Acoustic Models

Thomas Zenkel, Ramon Sanabria, Florian Metze, Alex Waibel

This paper proposes a novel approach to creating a unit set for CTC-based speech recognition systems. Using Byte-Pair Encoding, we learn a unit set of arbitrary size from a given training text. In contrast to using characters or words as units, this allows us to find a good trade-off between the size of the unit set and the available training data. We investigate both Crossword units, which may span multiple words, and Subword units. By evaluating these unit sets with decoding methods that use a separate language model, we show improvements over a purely character-based unit set.
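
As an illustration of how Byte-Pair Encoding can grow a unit set from characters, the following is a minimal sketch and not the authors' implementation. The function names `learn_bpe` and `merge_pair`, the `crossword` flag, the use of "_" as an explicit word-boundary symbol, and the tie-breaking rule are all assumptions made for this example; the abstract does not specify how word boundaries are handled. In subword mode merges stay inside words; in crossword mode the boundary symbol can take part in merges, so learned units may cross word boundaries.

```python
from collections import Counter


def merge_pair(seq, pair):
    """Replace every non-overlapping occurrence of `pair` in `seq` with one merged symbol."""
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(seq[i] + seq[i + 1])
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_bpe(text, num_merges, crossword=False):
    """Learn up to `num_merges` BPE merges from `text` (hypothetical helper).

    crossword=False: each word is its own sequence, so units never cross words.
    crossword=True:  the text is one sequence with "_" marking word boundaries
                     (an assumption of this sketch), so units may cross words.
    """
    if crossword:
        sequences = [list("_".join(text.split()))]
    else:
        sequences = [list(word) for word in text.split()]

    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs over all sequences.
        pairs = Counter()
        for seq in sequences:
            for a, b in zip(seq, seq[1:]):
                pairs[(a, b)] += 1
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken lexicographically
        # so that the result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        sequences = [merge_pair(seq, best) for seq in sequences]

    # Units that remain in the segmented training text after all merges.
    units = sorted({sym for seq in sequences for sym in seq})
    return merges, units


if __name__ == "__main__":
    text = "low low low lower lowest"
    print(learn_bpe(text, 2))
    print(learn_bpe(text, 3, crossword=True))
```

The number of merges controls the size of the unit set: zero merges leaves a purely character-based set, and each additional merge adds one longer unit, which is the trade-off between unit-set size and training data that the abstract refers to.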

DOI: 10.21437/Interspeech.2018-2057

Cite as: Zenkel, T., Sanabria, R., Metze, F., Waibel, A. (2018) Subword and Crossword Units for CTC Acoustic Models. Proc. Interspeech 2018, 396-400, DOI: 10.21437/Interspeech.2018-2057.

