INTERSPEECH 2011

A new text-independent speaker identification (SI) system is proposed. The system uses line spectral frequencies (LSFs) as an alternative feature set for capturing speaker characteristics. Taking the boundary and ordering properties of the LSFs into account, the LSFs are transformed to the differential LSF (DLSF) space. Because dynamic information is useful for speaker recognition, we represent the dynamic information of the DLSFs by considering two neighbors of the current frame, one from the past frames and one from the following frames. The current frame and its two neighbor frames are cascaded into a supervector. The statistical distribution of this supervector is modelled by the so-called super-Dirichlet mixture model, an extension of the Dirichlet mixture model. Compared with a conventional SI system that uses mel-frequency cepstral coefficients and is based on the Gaussian mixture model, the proposed SI system shows a promising improvement.
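The feature construction described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so several details here are assumptions: LSFs are taken to be in radians, strictly increasing within (0, pi); the DLSFs are taken as successive differences scaled by 1/pi; and the neighbor offset `tau` is a free parameter. The function names `lsf_to_dlsf` and `build_supervectors` are hypothetical and chosen only for illustration.

```python
import math


def lsf_to_dlsf(lsf):
    """Map ordered LSFs in (0, pi) to scaled successive differences.

    Assumption: DLSF_1 = LSF_1 / pi and DLSF_i = (LSF_i - LSF_{i-1}) / pi.
    Under the boundary and ordering properties, every output is positive
    and the outputs sum to less than 1.
    """
    padded = [0.0] + list(lsf) + [math.pi]
    diffs = [b - a for a, b in zip(padded, padded[1:])]
    if any(d <= 0.0 for d in diffs):
        raise ValueError("LSFs must be strictly increasing within (0, pi)")
    # Drop the final gap (pi - last LSF); it is implied by the others.
    return [d / math.pi for d in diffs[:-1]]


def build_supervectors(dlsf_frames, tau=1):
    """Cascade each frame with one past and one following neighbor.

    Assumption: the neighbors sit tau frames before and after the
    current frame. Frames without both neighbors are skipped.
    """
    return [
        dlsf_frames[t - tau] + dlsf_frames[t] + dlsf_frames[t + tau]
        for t in range(tau, len(dlsf_frames) - tau)
    ]


# Example with made-up LSF values (not data from the paper).
lsf_frames = [
    [0.30, 0.90, 1.60, 2.40],
    [0.32, 0.88, 1.65, 2.35],
    [0.28, 0.95, 1.58, 2.45],
    [0.31, 0.91, 1.62, 2.41],
    [0.29, 0.89, 1.61, 2.39],
]
dlsf_frames = [lsf_to_dlsf(frame) for frame in lsf_frames]
supervectors = build_supervectors(dlsf_frames, tau=1)
```

Each supervector here has three times the dimension of a single DLSF frame. Each of its three sub-blocks is positive with a sum below one, which is the kind of bounded support a Dirichlet-type density is suited to. Fitting the mixture model itself is not shown.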
Bibliographic reference. Ma, Zhanyu / Leijon, Arne (2011): "Super-Dirichlet mixture models using differential line spectral frequencies for text-independent speaker identification", In INTERSPEECH-2011, 2349-2352.