ISCA Archive Interspeech 2013

Affective classification of generic audio clips using regression models

Nikolaos Malandrakis, Shiva Sundaram, Alexandros Potamianos

We investigate acoustic modeling, feature extraction and feature selection for the problem of affective content recognition of generic, non-speech, non-music sounds. We annotate and analyze a database of generic sounds containing a subset of the BBC sound effects library. We use regression models, long-term features and wrapper-based feature selection to model affect in the continuous 3-D (arousal, valence, dominance) emotional space. Frame-level features are extracted from each audio clip and combined with functionals to estimate long-term temporal patterns over the duration of the clip. Experimental results show that the regression models provide categorical performance similar to that of the more popular Gaussian Mixture Models. They are also capable of predicting accurate affective ratings on continuous scales, achieving 62.67% 3-class accuracy and correlation with human ratings, higher than comparable numbers in the literature.
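The pipeline outlined in the abstract (frame-level features, functionals over the clip, a regression model, wrapper-based feature selection) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the four functionals (mean, standard deviation, minimum, maximum), ordinary least-squares linear regression, greedy forward selection as the wrapper strategy, 5-fold cross-validated mean squared error as the selection criterion, and all function names.

```python
import numpy as np

def clip_functionals(frames):
    """Collapse a (n_frames, n_features) matrix into one clip-level vector.

    The choice of functionals (mean, std, min, max) is an assumption.
    """
    frames = np.asarray(frames, dtype=float)
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0),
                           frames.min(axis=0), frames.max(axis=0)])

def fit_linear(X, y):
    """Least-squares linear regression with an intercept term."""
    A = np.column_stack([X, np.ones(len(X))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict_linear(w, X):
    """Apply the fitted weights (last entry is the intercept)."""
    return np.column_stack([X, np.ones(len(X))]) @ w

def cv_error(X, y, n_folds=5):
    """Mean squared error averaged over k contiguous folds."""
    indices = np.arange(len(y))
    errors = []
    for test_idx in np.array_split(indices, n_folds):
        train_idx = np.setdiff1d(indices, test_idx)
        w = fit_linear(X[train_idx], y[train_idx])
        residual = predict_linear(w, X[test_idx]) - y[test_idx]
        errors.append(np.mean(residual ** 2))
    return float(np.mean(errors))

def forward_select(X, y, max_features=5):
    """Greedy wrapper selection: repeatedly add the feature that most
    reduces cross-validated error, stopping when nothing improves."""
    selected, best_err = [], np.inf
    while len(selected) < max_features:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        if not candidates:
            break
        errs = {j: cv_error(X[:, selected + [j]], y) for j in candidates}
        j_best = min(errs, key=errs.get)
        if errs[j_best] >= best_err:
            break
        selected.append(j_best)
        best_err = errs[j_best]
    return selected, best_err

# Usage on synthetic data: 40 clips, each with 50 frames of 3 features,
# and one synthetic rating per clip (standing in for, e.g., arousal).
rng = np.random.default_rng(0)
clips = [rng.normal(size=(50, 3)) for _ in range(40)]
X_clips = np.vstack([clip_functionals(c) for c in clips])
ratings = 2.0 * X_clips[:, 0] + 0.01 * rng.normal(size=40)
chosen, err = forward_select(X_clips, ratings, max_features=3)
```

One regressor of this kind would be trained per affective dimension (arousal, valence, dominance). A wrapper method evaluates each feature subset by the performance of the model itself, which is why the selection loop calls the cross-validated regression rather than a standalone relevance score.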

doi: 10.21437/Interspeech.2013-245

Cite as: Malandrakis, N., Sundaram, S., Potamianos, A. (2013) Affective classification of generic audio clips using regression models. Proc. Interspeech 2013, 2832-2836, doi: 10.21437/Interspeech.2013-245

  author={Nikolaos Malandrakis and Shiva Sundaram and Alexandros Potamianos},
  title={{Affective classification of generic audio clips using regression models}},
  booktitle={Proc. Interspeech 2013},