This paper proposes a novel technique for exploiting discriminative models with subclasses in speech recognition. Discriminative models for speech recognition have attracted much attention in the past decade; however, most are still built on tree-clustered HMM states. In contrast, the proposed method, referred to as subclass AdaBoost, jointly selects an optimal data split and weak discriminators in each iteration of training, and forms a weak classifier as a composite of these weak discriminators. As a result, a strong classifier robust to a variety of subclasses is constructed without explicit clustering in advance. In the experiments, subclass AdaBoost is applied to phoneme classification, and N-best hypotheses are rescored using the resulting phoneme classifiers. Experimental results show that the proposed method achieves a relative word error reduction of over 10% in a continuous speech recognition task.
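The abstract describes each boosting round as jointly choosing a data split and per-subclass weak discriminators, then combining them into a single composite weak classifier. The following is a minimal sketch of that idea, not the paper's actual implementation: it uses decision stumps as the weak discriminators and takes a list of candidate subclass partitions from the caller. All names (`subclass_adaboost`, `stump_fit`, etc.) are illustrative assumptions.

```python
import numpy as np

def stump_fit(X, y, w):
    """Best decision stump (feature, threshold, polarity) under sample weights w."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best  # (weighted error, feature index, threshold, polarity)

def stump_predict(X, j, thr, pol):
    return np.where(pol * (X[:, j] - thr) >= 0, 1, -1)

def subclass_adaboost(X, y, splits, n_rounds=10):
    """splits: list of integer arrays, each assigning every sample to a subclass.
    Each round jointly picks the split whose per-subclass stumps give the lowest
    combined weighted error, forming a composite weak classifier."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []  # (alpha, split index, {subclass: stump params})
    for _ in range(n_rounds):
        best_err, best = np.inf, None
        for s_idx, assign in enumerate(splits):
            stumps, pred = {}, np.empty(n)
            for c in np.unique(assign):     # one weak discriminator per subclass
                m = assign == c
                _, j, thr, pol = stump_fit(X[m], y[m], w[m])
                stumps[c] = (j, thr, pol)
                pred[m] = stump_predict(X[m], j, thr, pol)
            err = w[pred != y].sum()        # error of the composite classifier
            if err < best_err:
                best_err, best = err, (s_idx, stumps, pred)
        eps = np.clip(best_err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - eps) / eps)
        s_idx, stumps, pred = best
        ensemble.append((alpha, s_idx, stumps))
        w *= np.exp(-alpha * y * pred)      # standard AdaBoost reweighting
        w /= w.sum()
    return ensemble

def predict(ensemble, X, splits):
    score = np.zeros(len(X))
    for alpha, s_idx, stumps in ensemble:
        assign = splits[s_idx]
        for c, (j, thr, pol) in stumps.items():
            m = assign == c
            score[m] += alpha * stump_predict(X[m], j, thr, pol)
    return np.sign(score)
```

Because the split is re-selected every round, the algorithm never commits to one clustering of the data up front, which is the contrast the abstract draws with tree-clustered HMM states.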
Bibliographic reference. Fujimura, Hiroshi / Shinohara, Yusuke / Masuko, Takashi (2013): "N-best rescoring by phoneme classifiers using subclass AdaBoost algorithm", in Proc. INTERSPEECH 2013, pp. 3327-3331.