Ninth International Conference on Spoken Language Processing

Pittsburgh, PA, USA
September 17-21, 2006

The Use of Bayesian Network for Incorporating Accent, Gender and Wide-Context Dependency Information

Sakriani Sakti, Konstantin Markov, Satoshi Nakamura

National Institute of Information and Communications Technology, Japan; and ATR Spoken Language Communication Research Laboratories, Japan

We propose a new method of incorporating additional knowledge, namely accent, gender, and wide-context dependency information, into ASR systems by exploiting Bayesian networks (BNs). We first incorporate only pentaphone-context dependency information, and then integrate accent and gender information as well. With this method, conventional triphone HMMs can easily be extended to cover various sources of knowledge. The probabilistic dependencies between a triphone context unit and the additional knowledge are learned through a BN. Another advantage is that the additional knowledge variables are assumed to be hidden during recognition, so the existing standard triphone-based decoding system can be used without modification. The proposed model was evaluated on an LVCSR task using two different types of accented English speech data. Experimental results show that the proposed method improves word accuracy relative to standard triphone models.
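
The idea of treating the additional knowledge as hidden during recognition can be illustrated with a minimal sketch. It assumes, purely for illustration, that each triphone state q has one 1-D Gaussian per value k of the additional variable (for example an accent/gender pair) and a learned prior P(k | q); the state likelihood seen by the decoder is then the marginal over k. The function names, labels, and numbers below are hypothetical and are not taken from the paper.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def state_likelihood(x, prior, components):
    """Return p(x | q) = sum_k P(k | q) * p(x | k, q).

    The additional variable k is hidden, so it is summed out and the
    decoder only ever sees a single likelihood per triphone state.
    """
    return sum(prior[k] * gaussian_pdf(x, *components[k]) for k in prior)


# Hypothetical parameters for one triphone state (illustrative values only).
prior = {
    ("accent_A", "female"): 0.3,
    ("accent_A", "male"): 0.2,
    ("accent_B", "female"): 0.25,
    ("accent_B", "male"): 0.25,
}
components = {  # (mean, variance) of p(x | k, q)
    ("accent_A", "female"): (0.0, 1.0),
    ("accent_A", "male"): (0.5, 1.2),
    ("accent_B", "female"): (-0.3, 0.8),
    ("accent_B", "male"): (0.8, 1.5),
}

likelihood = state_likelihood(0.4, prior, components)
```

Under these assumptions the marginal behaves like an ordinary mixture density attached to the triphone state, which is one way to see why a standard triphone-based decoder would need no modification.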


Bibliographic reference. Sakti, Sakriani / Markov, Konstantin / Nakamura, Satoshi (2006): "The use of Bayesian network for incorporating accent, gender and wide-context dependency information", In INTERSPEECH-2006, paper 1812-Wed1BuP.4.