A standard HMM cannot express how features vary over time while the model remains in the same state. We attempt to capture these dynamic changes using segmental statistics, and we propose a new speech recognition method that combines an HMM with segmental statistics. Because segmental statistics increase the dimension of the parameters, the covariance matrix is estimated less precisely. We therefore apply methods that compress the dimension and reduce computation, such as the K-L expansion and the Modified Quadratic Discriminant Function (MQDF). The proposed method outperformed traditional methods such as a conditional density HMM that uses the correlation between two frames and an HMM that uses regression coefficients as dynamic features.
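As a rough illustration of the dimension problem and of compression by K-L expansion, the sketch below stacks successive frames into one segment vector and projects the result onto a few leading eigenvectors of its covariance matrix. It is a minimal sketch under stated assumptions, not the authors' implementation: treating a "segment" as a concatenation of consecutive frames is an assumption, the function names (`stack_frames`, `mean_and_covariance`, `leading_eigenpairs`, `kl_project`) are hypothetical, and power iteration with deflation is swapped in only to keep the example dependency-free.

```python
import random


def stack_frames(frames, width):
    """Concatenate `width` successive frames into one segment vector per start position."""
    return [
        [x for frame in frames[t:t + width] for x in frame]
        for t in range(len(frames) - width + 1)
    ]


def mean_and_covariance(vectors):
    """Return the sample mean and the (biased) sample covariance matrix."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(d)]
    centered = [[v[i] - mean[i] for i in range(d)] for v in vectors]
    cov = [
        [sum(c[i] * c[j] for c in centered) / n for j in range(d)]
        for i in range(d)
    ]
    return mean, cov


def leading_eigenpairs(cov, k, iters=500, seed=0):
    """Approximate the k leading (eigenvalue, eigenvector) pairs of a symmetric
    matrix by power iteration with deflation (an illustrative choice only)."""
    d = len(cov)
    mat = [row[:] for row in cov]
    rng = random.Random(seed)
    pairs = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            if norm == 0.0:
                break  # remaining matrix is zero; keep the current unit vector
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for the unit vector v.
        lam = sum(
            v[i] * sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)
        )
        pairs.append((lam, v))
        # Deflate: remove the component just found so the next pass finds the next one.
        mat = [
            [mat[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)
        ]
    return pairs


def kl_project(vectors, mean, basis):
    """Project mean-centered vectors onto the given basis vectors."""
    return [
        [sum((v[i] - mean[i]) * b[i] for i in range(len(v))) for b in basis]
        for v in vectors
    ]
```

With, say, 4 stacked frames of 3 coefficients each, every segment vector has 12 dimensions, so the covariance matrix has 12 x 12 entries to estimate; projecting onto 2 eigenvectors reduces each segment to 2 values. In practice a linear algebra library's symmetric eigensolver would normally replace `leading_eigenpairs`.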
Bibliographic reference. Yamamoto, Kazumasa / Nakagawa, Seiichi (1995): "Comparative evaluation of segmental unit input HMM and conditional density HMM", In EUROSPEECH-1995, 1615-1618.