4th International Conference on Spoken Language Processing
Philadelphia, PA, USA
In HMM-based speech recognition, estimating the parameters of the HMMs is the counterpart of training or learning in traditional sequential pattern recognition, since the speech signal can be represented by a sequence of n-dimensional vectors once features have been extracted from it. However, because the duration of a phone varies randomly as well as with speaker and context, speech samples contribute unequally to the estimation of the HMM parameters. When only a small training set is available, as in speaker adaptation, the problem becomes serious. In this paper, we analyze the impact of phone duration on the output probability likelihood. To address this problem, we propose two approaches that make the contributions of speech samples to HMM parameter estimation proportionate: a geometrically averaged probability likelihood method and a centralized parametric space method. Several experiments are conducted to verify the advantage of these approaches in HMM-based speech recognition. The results show that the recognition rate improves to a certain degree when either approach is employed.
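The duration effect described above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes one plausible reading of geometric averaging: taking the geometric mean of the per-frame output probabilities, which is equivalent to dividing the log-likelihood by the number of frames. The function names and the toy per-frame probabilities are hypothetical and are not taken from the paper.

```python
import math

def total_log_likelihood(frame_log_probs):
    """Sum of per-frame log output probabilities (grows in magnitude with duration)."""
    return sum(frame_log_probs)

def geometric_mean_log_likelihood(frame_log_probs):
    """Log of the geometric mean of per-frame probabilities.

    Assumption: this is one plausible form of duration normalization,
    not necessarily the formulation used in the paper.
    """
    if not frame_log_probs:
        raise ValueError("at least one frame is required")
    return sum(frame_log_probs) / len(frame_log_probs)

# Hypothetical toy data: two samples of the same phone with identical
# per-frame probability but different durations (5 vs. 20 frames).
short_sample = [math.log(0.5)] * 5
long_sample = [math.log(0.5)] * 20

print(total_log_likelihood(short_sample), total_log_likelihood(long_sample))
print(geometric_mean_log_likelihood(short_sample),
      geometric_mean_log_likelihood(long_sample))
```

In this toy case the unnormalized log-likelihood of the longer sample is four times larger in magnitude, while the geometrically averaged values agree, so duration alone no longer changes the weight of a sample.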
Bibliographic reference. Li, Gongjun / Huang, Taiyi (1996): "An improved training algorithm in HMM-based speech recognition", In ICSLP-1996, 1057-1060.