Automatic emotion recognition and computational paralinguistics have become reasonably robust under controlled laboratory settings; however, accuracy degrades in real-life conditions, such as in the presence of noise and reverberation. In this paper we investigate the relevance of acoustic features for the expression of valence, arousal, and interest conveyed by a speaker's voice. Experiments are conducted on the GEMEP and TUM AVIC databases. To simulate realistic degraded conditions, the audio is corrupted with real room impulse responses and real-life noise recordings. Features that are well correlated with the target (emotion) over a wide range of acoustic conditions are analysed and interpreted. Classification results in matched and mismatched settings with multi-condition training are provided to validate the benefit of the feature selection method. Our proposed method of selecting features over a range of noise types considerably improves the generalisation ability of the classifiers.
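As an illustration of selecting features that correlate well with the target over a range of acoustic conditions, the following is a minimal sketch and not the procedure used in the paper. It assumes that the same utterances are available under each condition, that Pearson correlation is the relevance measure, and that the per-condition scores are aggregated by averaging their absolute values; the function name `rank_features_across_conditions` and its parameters are hypothetical.

```python
import numpy as np


def rank_features_across_conditions(features_by_condition, target, top_k):
    """Rank features by mean absolute Pearson correlation with the target.

    features_by_condition: dict mapping a condition name (e.g. "clean",
        "noisy") to an array of shape (n_samples, n_features). All arrays
        are assumed to describe the same samples in the same order.
    target: array of shape (n_samples,), e.g. an arousal rating.
    top_k: number of feature indices to return.

    Averaging |r| over conditions is an assumption of this sketch.
    """
    target = np.asarray(target, dtype=float)
    t_centered = target - target.mean()
    t_norm = np.linalg.norm(t_centered)

    per_condition = []
    for X in features_by_condition.values():
        X = np.asarray(X, dtype=float)
        X_centered = X - X.mean(axis=0)
        denom = np.linalg.norm(X_centered, axis=0) * t_norm
        # Constant features (zero norm) get a correlation of 0.
        with np.errstate(divide="ignore", invalid="ignore"):
            r = np.where(denom > 0, (X_centered.T @ t_centered) / denom, 0.0)
        per_condition.append(np.abs(r))

    # A feature scores highly only if it is relevant in every condition.
    mean_abs_r = np.mean(per_condition, axis=0)
    order = np.argsort(-mean_abs_r)[:top_k]
    return order, mean_abs_r
```

Under these assumptions, a feature that is highly correlated with the target in clean audio but uncorrelated in noisy audio is ranked below a feature that remains moderately correlated in both, which is the intuition behind selecting features across conditions instead of per condition.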
Bibliographic reference. Eyben, Florian / Weninger, Felix / Schuller, Björn (2013): "Affect recognition in real-life acoustic conditions — a new perspective on feature selection", In INTERSPEECH-2013, 2044-2048.