Odyssey 2012 - The Speaker and Language Recognition Workshop

June 25-28, 2012

Preliminary Investigation of Boltzmann Machine Classifiers for Speaker Recognition

Themos Stafylakis, Patrick Kenny, Mohammed Senoussaoui, Pierre Dumouchel

Centre de recherche informatique de Montréal (CRIM) and
École de technologie supérieure (ETS), Montréal, Canada

We propose a novel generative approach to speaker recognition using Boltzmann machines, a fledgling non-Gaussian probabilistic framework that is gaining increasing attention in several machine learning fields. We show how a modified i-vector representation of speech utterances enables the development of several Boltzmann machine architectures for speaker verification, and we report some preliminary speaker recognition results obtained with one of them, which we refer to as Siamese twins. The Siamese twin architecture is designed to capture correlations between utterances spoken by a single speaker, and it can be regarded as a probabilistic analogue of the well-known cosine distance metric. A relative improvement of 27% is reported on NIST-2010 telephone female data.
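For readers unfamiliar with the cosine distance metric mentioned above, the following is a minimal sketch of conventional cosine scoring between two utterance vectors. It is not the Boltzmann machine model described in the paper; the function name `cosine_score` and the toy vectors are assumptions made for illustration only.

```python
import numpy as np

def cosine_score(w1, w2):
    """Cosine similarity between two utterance vectors (e.g. i-vectors).

    Hypothetical helper for illustration; not taken from the paper.
    """
    w1 = np.asarray(w1, dtype=float)
    w2 = np.asarray(w2, dtype=float)
    # Inner product normalised by the two vector lengths.
    return float(np.dot(w1, w2) / (np.linalg.norm(w1) * np.linalg.norm(w2)))

# Toy vectors standing in for an enrollment and a test utterance.
enroll = [0.2, -1.0, 0.5]
test = [0.1, -0.8, 0.7]
score = cosine_score(enroll, test)
```

In a verification setting, a score of this kind would typically be compared against a threshold to accept or reject the claim that both utterances come from the same speaker.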


Bibliographic reference.  Stafylakis, Themos / Kenny, Patrick / Senoussaoui, Mohammed / Dumouchel, Pierre (2012): "Preliminary investigation of Boltzmann machine classifiers for speaker recognition", In Odyssey-2012, 109-116.