13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

An Alignment Matching Method to Explore Pseudosyllable Properties Across Different Corpora

Raymond W. M. Ng (1), Thomas Hain (1), Keikichi Hirose (2)

(1) Department of Computer Science, The University of Sheffield, Sheffield, UK
(2) Graduate School of Information Science and Technology, The University of Tokyo, Japan

A pseudosyllable unit was derived for English read speech recognition. It remains an open question whether the pseudosyllable unit can be extracted robustly, and how the unit could help the speech recognition process by giving indications of the error pattern. In this study, an evaluation method that maps every hypothesis phoneme to every reference is proposed. Pseudosyllables extracted from two different sets of speech data are analysed. Mutual information is used to examine the relationship between different pseudosyllable aspects and the error pattern of the hypothesis phonemes. The results show that the pseudosyllable extraction algorithm is robust and yields units of a consistent nature. Pseudosyllables which have a phone triplet structure tend to have fewer insertion errors. Pseudosyllables which overlap with their neighbours are places where more insertion errors may occur.

Index Terms: pseudosyllable, error analysis, mutual information, speech recognition
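
As an illustration of the mutual information analysis mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes each hypothesis phoneme has already been given two discrete labels: a pseudosyllable aspect and an error type. The function name `mutual_information`, the label values ("triplet", "other", "insertion", "correct") and the toy pairs are hypothetical, chosen only to show the computation.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Estimate I(X; Y) in bits from a sequence of (x, y) label pairs.

    Probabilities are plain relative frequencies; no smoothing is applied.
    """
    pairs = list(pairs)
    n = len(pairs)
    if n == 0:
        raise ValueError("at least one (x, y) pair is required")

    joint = Counter(pairs)                # counts of each (x, y) combination
    count_x = Counter(x for x, _ in pairs)  # marginal counts of x
    count_y = Counter(y for _, y in pairs)  # marginal counts of y

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Toy data (made up): the aspect fully determines the error type.
dependent = [("triplet", "correct"), ("other", "insertion")]

# Toy data (made up): the aspect carries no information about the error type.
independent = [
    ("triplet", "correct"), ("triplet", "insertion"),
    ("other", "correct"), ("other", "insertion"),
]

mi_dependent = mutual_information(dependent)
mi_independent = mutual_information(independent)
```

A value near zero suggests that the pseudosyllable aspect says little about the error type, while larger values suggest a stronger association. With real data, the pairs would come from the aligned hypothesis phonemes rather than from hand-written lists.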


Bibliographic reference. Ng, Raymond W. M. / Hain, Thomas / Hirose, Keikichi (2012): "An alignment matching method to explore pseudosyllable properties across different corpora", in INTERSPEECH-2012, 863-866.