13th Annual Conference of the International Speech Communication Association

Portland, OR, USA
September 9-13, 2012

Likability Classification - A Not so Deep Neural Network Approach

Raymond Brueckner (1,2), Björn Schuller (2)

(1) Institute for Human-Machine Communication, Technische Universität München, Germany
(2) Nuance Communications Inc., Aachen, Germany

This paper presents results on the application of restricted Boltzmann machines (RBM) and deep belief networks (DBN) to the Likability Sub-Challenge of the Interspeech 2012 Speaker Trait Challenge. RBMs are a particular form of log-linear Markov random field. They are generative models that estimate the probability distribution of the underlying input data and can be trained in an unsupervised fashion. DBNs are constructed by stacking RBMs and are known to yield an increasingly complex representation of the input data as the number of layers increases. Our results show that the Likability Sub-Challenge classification task does not benefit from the modeling power of DBNs, but that using an RBM as the first stage of a two-layer neural network with subsequent fine-tuning improves the baseline result from 59.0% to 64.0%, i.e., a relative improvement of 8.5% in the unweighted average evaluation measure.
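To make the abstract's description of unsupervised RBM training concrete, the following is a minimal sketch of a Bernoulli RBM trained with one-step contrastive divergence (CD-1) in NumPy. It is not the authors' implementation: the abstract does not state the features, layer sizes, or training procedure, so the class name `BernoulliRBM`, the toy binary data, the learning rate, and the epoch count are all assumptions chosen for illustration. The helper `relative_improvement` simply reproduces the arithmetic behind the reported relative gain.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class BernoulliRBM:
    """Hypothetical minimal RBM with binary visible and hidden units."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        # Small random weights break the symmetry between hidden units.
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.1):
        """One CD-1 update on a batch; returns mean squared reconstruction error."""
        ph0 = self.hidden_probs(v0)                       # positive phase
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        pv1 = self.visible_probs(h0)                      # reconstruction
        ph1 = self.hidden_probs(pv1)                      # negative phase
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
        self.b += lr * (v0 - pv1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)
        return float(np.mean((v0 - pv1) ** 2))


def train_toy(epochs=1000):
    """Train on two repeated binary prototypes (toy data, not challenge features)."""
    protos = np.array([[1, 1, 1, 0, 0, 0],
                       [0, 0, 0, 1, 1, 1]], dtype=float)
    data = np.repeat(protos, 10, axis=0)
    rbm = BernoulliRBM(n_visible=6, n_hidden=4)
    errors = [rbm.cd1_step(data) for _ in range(epochs)]
    return rbm, errors


def relative_improvement(baseline, new):
    """Relative improvement in percent."""
    return 100.0 * (new - baseline) / baseline


if __name__ == "__main__":
    rbm, errors = train_toy()
    print("first / last reconstruction error:", errors[0], errors[-1])
    print("relative improvement:", round(relative_improvement(59.0, 64.0), 1))
```

In a setup like the one the abstract describes, the learned hidden activations (`hidden_probs`) would serve as the first layer of a feed-forward network that is then fine-tuned with labels. Stacking further RBMs on those activations would give a DBN, which the abstract reports did not help on this task.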

Index Terms: Likability, speaker trait challenge, restricted Boltzmann machines, deep belief networks


Bibliographic reference.  Brueckner, Raymond / Schuller, Björn (2012): "Likability classification - a not so deep neural network approach", In INTERSPEECH-2012, 290-293.