INTERSPEECH 2012

This paper presents results on the application of restricted Boltzmann machines (RBMs) and deep belief networks (DBNs) to the Likability Sub-Challenge of the Interspeech 2012 Speaker Trait Challenge. RBMs are a particular form of log-linear Markov random field: generative models that can be trained in an unsupervised fashion to model the probability distribution of the underlying input data. DBNs can be constructed by stacking RBMs and are known to yield an increasingly complex representation of the input data as the number of layers increases. Our results show that the Likability Sub-Challenge classification task does not benefit from the modeling power of DBNs, but that using an RBM as the first stage of a two-layer neural network with subsequent fine-tuning improves on the baseline result, from 59.0% to 64.0%, i.e. a relative improvement of 8.5% in the unweighted average evaluation measure.
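As a rough illustration of the unsupervised RBM training mentioned above, the following minimal sketch performs one update of a small RBM and re-derives the reported relative improvement. Everything beyond the abstract is an assumption made for illustration: binary visible and hidden units, one-step contrastive divergence (CD-1), the learning rate, and the helper names `cd1_step` and `relative_improvement`. The abstract does not state which unit types or training procedure the authors used.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, rng, lr=0.1):
    """One CD-1 update for a binary RBM (assumed setup, not the authors').

    v0: (batch, n_visible) data, W: (n_visible, n_hidden) weights,
    b: visible biases, c: hidden biases.
    """
    # Positive phase: hidden probabilities and a binary sample given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(v0.dtype)
    # Negative phase: reconstruct the visibles, then recompute hidden probabilities.
    pv1 = sigmoid(h0 @ W.T + b)
    ph1 = sigmoid(pv1 @ W + c)
    # Approximate log-likelihood gradient: data statistics minus reconstruction statistics.
    n = v0.shape[0]
    W = W + lr * (v0.T @ ph0 - pv1.T @ ph1) / n
    b = b + lr * (v0 - pv1).mean(axis=0)
    c = c + lr * (ph0 - ph1).mean(axis=0)
    return W, b, c

def relative_improvement(baseline, result):
    """Relative gain of `result` over `baseline`, as a fraction."""
    return (result - baseline) / baseline

# Toy usage on random binary data (hypothetical sizes).
rng = np.random.default_rng(0)
v = (rng.random((8, 6)) < 0.5).astype(float)
W = 0.01 * rng.standard_normal((6, 4))
b, c = np.zeros(6), np.zeros(4)
W, b, c = cd1_step(v, W, b, c, rng)

# (64.0 - 59.0) / 59.0, the figures quoted in the abstract.
print(f"{100 * relative_improvement(59.0, 64.0):.1f}%")
```

In a setup like the one the abstract describes, weights pretrained this way would initialize the first layer of a two-layer network, which is then fine-tuned with supervised training on the likability labels.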
Index Terms: Likability, speaker trait challenge, restricted Boltzmann machines, deep belief networks
Bibliographic reference. Brueckner, Raymond / Schuller, Björn (2012): "Likability classification - a not so deep neural network approach", In INTERSPEECH-2012, 290-293.