Speech Prosody 2012

Shanghai, China
May 22-25, 2012

Automatic Segmentation of English Words using Phonotactic and Syllable Information

Raymond W. M. Ng, Keikichi Hirose

Graduate School of Information Science and Technology, The University of Tokyo, Japan

It is difficult to demonstrate the effectiveness of prosodic features in automatic word recognition. Recently, we applied the suprasegmental concept and proposed an extra layer of acoustic modeling with syllables. However, syllable units do not align with word units, and this mismatch complicates the steps that follow acoustic modeling. In this study, we explore English word segmentation without a pronunciation dictionary. The algorithm uses phonotactic and pseudosyllable information and is trained as a direct model with conditional random fields. It attains an F-measure of 0.69. This result opens the possibility of automatic word recognition with the extra layer of syllable modeling.

Index Terms: automatic word recognition, word segmentation, pseudosyllable
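
For readers unfamiliar with how an F-measure is typically computed for word segmentation, the following is a minimal sketch. It assumes a boundary-based scoring in which each word is a list of phone labels and a hypothesized boundary counts as correct only if it falls at the same phone index as a reference boundary. The abstract does not state the exact scoring protocol, so the function names and the matching rule here are illustrative assumptions, not the authors' evaluation code.

```python
def boundaries(words):
    """Return the set of internal word-boundary positions, measured in phones.

    `words` is a list of words, each word a list of phone labels.
    The end of the final word is not counted as a boundary.
    """
    positions = set()
    index = 0
    for word in words[:-1]:
        index += len(word)
        positions.add(index)
    return positions


def boundary_f_measure(reference, hypothesis):
    """Harmonic mean of boundary precision and recall (assumed scoring rule)."""
    ref = boundaries(reference)
    hyp = boundaries(hypothesis)
    correct = len(ref & hyp)
    if correct == 0:
        return 0.0
    precision = correct / len(hyp)
    recall = correct / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: "the cat sat" as phone sequences.
reference = [["DH", "AH"], ["K", "AE", "T"], ["S", "AE", "T"]]
# The hypothesis misplaces the second boundary by one phone.
hypothesis = [["DH", "AH"], ["K", "AE"], ["T", "S", "AE", "T"]]

print(boundary_f_measure(reference, hypothesis))  # 0.5
```

In this toy case one of two hypothesized boundaries matches one of two reference boundaries, so precision and recall are both 0.5. A word-level variant, which requires both edges of a word to be correct, would be stricter.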

Bibliographic reference.  Ng, Raymond W. M. / Hirose, Keikichi (2012): "Automatic segmentation of English words using phonotactic and syllable information", In SP-2012, 27-30.