14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

A Method for Structure Estimation of Weighted Finite-State Transducers and its Application to Grapheme-to-Phoneme Conversion

Yotaro Kubo, Takaaki Hori, Atsushi Nakamura

NTT Corporation, Japan

Weighted finite-state transducers (WFSTs) are widely used as a fundamental data structure in spoken language processing systems because they provide a unified representation of many types of probabilistic models. Although accurate WFSTs are important in many spoken language systems, WFSTs are conventionally obtained by transforming probabilistic models that are not estimated in terms of WFST accuracy. Several recent techniques have enabled the direct optimization of weight parameters in WFSTs; however, these techniques do not optimize the structures of WFSTs directly. In this paper, with the goal of estimating WFST structures directly from a dataset, we introduce a Bayesian method for structure inference. The proposed method employs the hierarchical Dirichlet process (HDP) as a prior over the generative processes of arcs in the WFSTs. Because the HDP can handle countably infinite sets of entities, the proposed method can potentially generate an infinite number of arcs in the WFSTs. We verify the efficiency of the proposed method by estimating WFSTs for grapheme-to-phoneme (G2P) conversion, and confirm that the WFST obtained by the proposed method yields a more compact representation of G2P conversion than conventional N-gram-based G2P models.
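
To illustrate how a Dirichlet process prior can place mass on an unbounded set of arcs, the following is a minimal sketch of a single-level Dirichlet process sampled through its Chinese restaurant process representation. It is not the authors' HDP model or inference procedure: the arc encoding `(grapheme, phoneme, next_state)`, the toy symbol sets, the geometric base distribution over state indices, and the names `sample_arcs_crp` and `base_arc_sampler` are all assumptions made for illustration.

```python
import random
from collections import Counter

# Hypothetical toy alphabets; not taken from the paper.
GRAPHEMES = "abc"
PHONEMES = ["AH", "B", "K"]


def base_arc_sampler(rng):
    """Draw an arc from an assumed base distribution over a countably infinite arc set."""
    grapheme = rng.choice(GRAPHEMES)
    phoneme = rng.choice(PHONEMES)
    # Geometric distribution over destination-state indices: any non-negative
    # integer has nonzero probability, so the set of possible arcs is unbounded.
    next_state = 0
    while rng.random() < 0.5:
        next_state += 1
    return (grapheme, phoneme, next_state)


def sample_arcs_crp(num_draws, alpha, base_sampler, rng):
    """Sequentially draw arcs from a Dirichlet process via the Chinese restaurant process.

    alpha (> 0) is the concentration parameter: larger values favor fresh
    draws from the base distribution, smaller values favor reusing arcs.
    """
    counts = Counter()
    draws = []
    for n in range(num_draws):
        if rng.random() < alpha / (n + alpha):
            # Fresh draw from the base distribution (always taken when n == 0).
            arc = base_sampler(rng)
        else:
            # Reuse a previously drawn arc with probability proportional to its count.
            arcs = list(counts.keys())
            weights = [counts[a] for a in arcs]
            arc = rng.choices(arcs, weights=weights, k=1)[0]
        counts[arc] += 1
        draws.append(arc)
    return draws, counts


if __name__ == "__main__":
    draws, counts = sample_arcs_crp(200, alpha=1.0,
                                    base_sampler=base_arc_sampler,
                                    rng=random.Random(0))
    print(f"{len(counts)} distinct arcs in {len(draws)} draws")
    for arc, c in counts.most_common(5):
        print(arc, c)
```

In this sketch, frequently used arcs are reused more often, so only a small number of distinct arcs tends to appear even though the base distribution supports infinitely many. A hierarchical version would additionally share one such process across the per-state arc distributions.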


Bibliographic reference. Kubo, Yotaro / Hori, Takaaki / Nakamura, Atsushi (2013): "A method for structure estimation of weighted finite-state transducers and its application to grapheme-to-phoneme conversion", in INTERSPEECH-2013, 647-651.