Second International Conference on Spoken Language Processing (ICSLP'92)
Banff, Alberta, Canada
Classification with semicontinuous hidden Markov models (SCHMMs) achieves very good performance in speaker-dependent mode, but performance drops considerably in speaker-independent mode. Hence there is a need for rapid, non-supervised adaptation of a speaker-independent system to an unknown user, without any restrictions on that user. The new speaker can use the system immediately for his or her own purpose and need not be aware of the adaptation (no fixed texts are required). Supervision by the user is optional, not obligatory. Two adaptation methods are described. The first algorithm, SPONGE, rapidly adapts the parameters of the codebooks in use to the data of the new speaker. The second algorithm, ADDMIX, re-trains the parameters of the SCHMMs themselves (mixture coefficients only). Both methods can handle changes of speaker that the users do not announce. The described algorithms are applicable to all speaker-independent speech recognition systems that use SCHMMs. The relative improvement in recognition rate is about 10-20%, depending on the unknown speaker. SPONGE is compared with the well-known LVQ1 and LVQ2 algorithms of T. Kohonen and with two other algorithms developed at our institute. ADDMIX is compared with three other methods intended to sharpen the mixture coefficients within the states of the SCHMM.
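As a rough illustration of the two parameter sets the abstract distinguishes, the sketch below computes an SCHMM state emission density from a codebook shared by all states and a per-state vector of mixture coefficients, and then applies a generic nearest-codeword mean update to unlabeled frames. This is not SPONGE or ADDMIX, whose details are not given here. The function names, the diagonal-covariance Gaussians, and the `rate` parameter are assumptions made only for this example.

```python
import math


def gaussian_diag(x, mean, var):
    """Density of a diagonal-covariance Gaussian evaluated at vector x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def schmm_emission(x, codebook, mixture_weights):
    """b_j(x) = sum_k c_jk * N(x; mu_k, var_k).

    The codebook (list of (mean, var) pairs) is shared by all states;
    only mixture_weights (the c_jk) are specific to state j.
    """
    return sum(c * gaussian_diag(x, mean, var)
               for c, (mean, var) in zip(mixture_weights, codebook))


def adapt_codebook_means(frames, codebook, rate=0.1):
    """Generic unsupervised update (an assumption, NOT the SPONGE algorithm):
    move the mean of the nearest codeword a fraction `rate` toward each frame.
    Returns a new codebook; the input codebook is left unchanged.
    """
    adapted = [(list(mean), list(var)) for mean, var in codebook]
    for x in frames:
        nearest = min(
            range(len(adapted)),
            key=lambda k: sum((xi - mi) ** 2
                              for xi, mi in zip(x, adapted[k][0])),
        )
        mean = adapted[nearest][0]
        for d in range(len(mean)):
            mean[d] += rate * (x[d] - mean[d])
    return adapted


# Hypothetical one-dimensional codebook with two codewords.
codebook = [([0.0], [1.0]), ([4.0], [1.0])]
state_weights = [0.5, 0.5]
density = schmm_emission([0.0], codebook, state_weights)
adapted = adapt_codebook_means([[1.0]], codebook, rate=0.5)
```

Because the codebook is shared, shifting its means changes the emission densities of every state at once without touching the mixture coefficients, which is one plausible reason a codebook-level adaptation can act on little data from a new speaker.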
Bibliographic reference. Schiel, Florian (1992): "Rapid non-supervised speaker adaptation of semicontinuous hidden Markov models", In ICSLP-1992, 1463-1466.