Emotion Profile Refinery for Speech Emotion Classification

Shuiyang Mao, P.C. Ching, Tan Lee


Human emotions are inherently ambiguous and impure. When designing systems to anticipate human emotions based on speech, the lack of emotional purity must be considered. However, most current methods for speech emotion classification rest on consensus labeling, e.g., a single hard label for an utterance. This labeling principle poses challenges for system performance given emotional impurity. In this paper, we recommend the use of emotion profiles (EPs), which provide a time series of segment-level soft labels to capture the subtle blends of emotional cues present across a specific speech utterance. We further propose the emotion profile refinery (EPR), an iterative procedure to update EPs. The EPR method produces soft, dynamically generated, multiple probabilistic class labels during successive stages of refinement, which results in significant improvements in model accuracy. Experiments on three well-known emotion corpora show noticeable gains using the proposed method.
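The iterative idea can be illustrated with a minimal sketch. This is not the authors' implementation: the linear softmax classifier, the function names (`train_segment_model`, `refine_profiles`), the hyperparameters, and the choice to start from utterance-level hard labels copied to every segment are all assumptions made for illustration. Each stage fits a segment-level model to the current targets, then replaces those targets with the model's soft predictions, so later stages train on probabilistic labels rather than one hard label.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_segment_model(x, targets, epochs=200, lr=0.5):
    """Fit a linear softmax classifier (an assumed stand-in model) to
    per-segment targets, which may be hard one-hot or soft distributions."""
    n, d = x.shape
    k = targets.shape[1]
    w = np.zeros((d, k))
    b = np.zeros(k)
    for _ in range(epochs):
        p = softmax(x @ w + b)
        grad = (p - targets) / n  # gradient of mean cross-entropy w.r.t. logits
        w -= lr * (x.T @ grad)
        b -= lr * grad.sum(axis=0)
    return w, b

def refine_profiles(x, segment_labels, num_classes, stages=3):
    """Hypothetical refinement loop: returns one array of soft labels per stage.

    x              : (num_segments, feature_dim) segment features
    segment_labels : (num_segments,) integer labels, assumed here to be the
                     utterance-level label copied to each of its segments
    """
    targets = np.eye(num_classes)[segment_labels]  # stage 0: hard one-hot labels
    profiles = []
    for _ in range(stages):
        w, b = train_segment_model(x, targets)
        targets = softmax(x @ w + b)  # soft labels feed the next stage
        profiles.append(targets)
    return profiles

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(12, 4))      # synthetic features, illustration only
    labels = rng.integers(0, 3, size=12)  # synthetic inherited labels
    for stage, ep in enumerate(refine_profiles(feats, labels, num_classes=3), 1):
        print(f"stage {stage}: first segment soft label = {np.round(ep[0], 3)}")
```

In this sketch, the rows of a returned array that belong to one utterance, taken in temporal order, play the role of that utterance's emotion profile: a time series of per-segment class distributions.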


DOI: 10.21437/Interspeech.2020-1771

Cite as: Mao, S., Ching, P., Lee, T. (2020) Emotion Profile Refinery for Speech Emotion Classification. Proc. Interspeech 2020, 531-535, DOI: 10.21437/Interspeech.2020-1771.


@inproceedings{Mao2020,
  author={Shuiyang Mao and P.C. Ching and Tan Lee},
  title={{Emotion Profile Refinery for Speech Emotion Classification}},
  year=2020,
  booktitle={Proc. Interspeech 2020},
  pages={531--535},
  doi={10.21437/Interspeech.2020-1771},
  url={http://dx.doi.org/10.21437/Interspeech.2020-1771}
}