Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Separation of Speakers in Audio Data

Jesper O. Olsen

Center for PersonKommunikation, Aalborg University, Aalborg, Denmark

Speaker separation is a technique with potentially many applications, for instance as an aid in browsing audio documents. This paper describes a novel speaker separation method in which speaker models are created without any training data being available in advance. The method was tested on realistic, unconstrained telephone conversations, and ergodic Hidden Markov Models were used for speaker modelling. The overall sequence and duration accuracies were 87% and 94%, respectively, when no prior knowledge of the speakers (i.e. training data) was used.

Keywords: Speaker Separation, Speaker Recognition, Hidden Markov Models.

Bibliographic reference. Olsen, Jesper O. (1995): "Separation of speakers in audio data", in EUROSPEECH-1995, 355-358.