4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
A new scheme is proposed for incorporating prosodic processing into speech recognition: accent nuclei at the head of words are detected automatically and used to limit the search space, that is, to preselect candidate words. This paper focuses on the proposed method for automatically detecting these accent nuclei and on its performance. The scheme is expected to improve recognition speed. It derives from a finding of perceptual experiments conducted previously by the first author, whose results indicated that an accent nucleus at the first mora has an accelerating effect on word perception. This effect can be explained by the earlier identification of the word's accent type as type 1 from its nucleus at the first mora. In other words, an accent nucleus at the head of a word can effectively limit the search space in the mental lexicon. This mechanism was implemented on a computer using HMMs and examined on isolated words; vowel detection by broad segmental features and the rejection of words with a devoiced vowel at the first or second mora were introduced at the same time. Evaluation experiments showed a recall of 94.7% and a precision of 90.0% for accent nucleus detection.
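As a rough illustration of the two ideas in the abstract, the sketch below shows how a detected head accent nucleus might be used to preselect type-1 candidates from a lexicon, and how recall and precision of a detector are computed. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `Entry`, `preselect`, and `recall_precision`, the toy lexicon, and the choice to leave the lexicon unpruned when no nucleus is detected or the word is rejected are all hypothetical. The HMM-based detector itself is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Entry:
    """Hypothetical lexicon entry: a word label and its accent type."""
    word: str
    accent_type: int  # 1 means the accent nucleus is on the first mora


# Toy lexicon for illustration only (not data from the paper).
LEXICON = [
    Entry("hashi (chopsticks)", 1),
    Entry("hashi (bridge)", 2),
    Entry("ame (rain)", 1),
    Entry("ame (candy)", 0),
]


def preselect(lexicon: Sequence[Entry], head_nucleus: Optional[bool]) -> List[Entry]:
    """Limit the search space using the detector's decision.

    head_nucleus is True when a nucleus was detected at the head of the word,
    False when none was detected, and None when the word was rejected
    (for example, because of a devoiced vowel at the first or second mora).
    Assumption: only a positive detection prunes the lexicon.
    """
    if head_nucleus:
        return [e for e in lexicon if e.accent_type == 1]
    return list(lexicon)


def recall_precision(reference: Sequence[bool], detected: Sequence[bool]) -> Tuple[float, float]:
    """Recall and precision of head-nucleus detection over a set of words."""
    tp = sum(r and d for r, d in zip(reference, detected))      # correctly detected
    fn = sum(r and not d for r, d in zip(reference, detected))  # missed nuclei
    fp = sum(d and not r for r, d in zip(reference, detected))  # false alarms
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision
```

With this sketch, `preselect(LEXICON, True)` keeps only the two type-1 entries, while `preselect(LEXICON, None)` returns the lexicon unchanged. Recall is the fraction of true head nuclei that the detector finds, and precision is the fraction of its detections that are correct.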
Bibliographic reference. Minematsu, Nobuaki / Nakagawa, Seiichi (1996): "Automatic detection of accent nuclei at the head of words for speech recognition", In ICSLP-1996, 1620-1623.