Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

Speech Recognition Using Tree-Structured Probability Density Function

Takao Watanabe, Koichi Shinoda, Keizaburo Takagi, Eiko Yamada

Information Technology Research Laboratories, NEC Corporation, Kawasaki, Japan

This paper proposes a new speech recognition method using tree-structured probability density functions (pdfs) to realize high-speed HMM-based speech recognition. In order to reduce the likelihood calculation for a pdf set composed of the Gaussian pdfs for all mixture components, all states and all recognition units, the likelihood is calculated only coarsely for each element pdf (element of the pdf set) whose likelihood N_k[x_t] at time t is not likely to be large. The pdf set is expressed in a tree-structured form. A leaf node of the tree corresponds to an element pdf. A non-leaf node corresponds to a cluster composed of element pdfs. To each cluster is attached a cluster pdf obtained by approximating the mixture of all element pdfs in the cluster by a single Gaussian pdf. In recognition, the likelihood set is calculated by searching the tree: starting from the root node, the likelihood is calculated from the cluster pdf at each node, and the nodes with the largest likelihood are traversed. Recognition experiments showed that the amount of computation was drastically reduced by the proposed method with little degradation in recognition accuracy for both speaker-independent and speaker-adaptive modes.
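The tree search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the diagonal-covariance Gaussians, the names Node, log_gauss, leaf_ids and tree_log_likelihoods, the beam parameter (how many best-scoring child clusters are expanded per node), and the rule that unexpanded leaves inherit their cluster pdf's score are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Tree node: a cluster pdf (non-leaf) or an element pdf (leaf)."""
    mean: List[float]
    var: List[float]                      # diagonal covariance (assumption)
    children: List["Node"] = field(default_factory=list)
    leaf_id: Optional[int] = None         # set only for element pdfs


def log_gauss(x: List[float], mean: List[float], var: List[float]) -> float:
    """Log-likelihood of x under a diagonal-covariance Gaussian."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def leaf_ids(node: Node) -> List[int]:
    """Collect the ids of all element pdfs below a node."""
    if node.leaf_id is not None:
        return [node.leaf_id]
    ids: List[int] = []
    for child in node.children:
        ids.extend(leaf_ids(child))
    return ids


def tree_log_likelihoods(root: Node, x: List[float], beam: int = 1) -> Dict[int, float]:
    """Return a log-likelihood for every element pdf.

    Leaves reached by the search get their exact score; leaves under a
    cluster that was not expanded get that cluster pdf's score (coarse).
    """
    if root.leaf_id is not None:
        return {root.leaf_id: log_gauss(x, root.mean, root.var)}
    out: Dict[int, float] = {}
    frontier = [root]
    while frontier:
        next_frontier: List[Node] = []
        for node in frontier:
            # Score every child, best first.
            scored = sorted(
                ((log_gauss(x, c.mean, c.var), c) for c in node.children),
                key=lambda pair: pair[0],
                reverse=True,
            )
            for rank, (score, child) in enumerate(scored):
                if child.leaf_id is not None:
                    out[child.leaf_id] = score          # exact element score
                elif rank < beam:
                    next_frontier.append(child)         # expand this cluster
                else:
                    for i in leaf_ids(child):           # coarse approximation
                        out[i] = score
        frontier = next_frontier
    return out
```

With beam=1 only the best-scoring cluster at each level is expanded, so the number of Gaussian evaluations grows roughly with the tree depth rather than with the total number of element pdfs; a larger beam trades computation for a closer approximation.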


Bibliographic reference.  Watanabe, Takao / Shinoda, Koichi / Takagi, Keizaburo / Yamada, Eiko (1994): "Speech recognition using tree-structured probability density function", In ICSLP-1994, 223-226.