The Effect of Silence Feature in Dimensional Speech Emotion Recognition

Bagus Tris Atmaja, Masato Akagi


Silence is a part of human-to-human communication and can be a clue for human emotion perception. For automatic emotion recognition by a computer, however, it is not clear whether silence is useful for determining human emotion from speech. This paper investigates the effect of using a silence feature in dimensional emotion recognition. Because the silence feature is extracted per utterance, we grouped it with high-level statistical functions computed from a set of acoustic features. The results reveal that the silence feature affects the arousal dimension more than the other emotion dimensions. A proper choice of the factor used in calculating the silence feature improves the performance of dimensional speech emotion recognition in terms of the concordance correlation coefficient. Conversely, an improper choice of that factor decreases performance with the same architecture.
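The abstract reports performance as a concordance correlation coefficient (CCC). As a minimal sketch of the standard CCC definition, not the authors' evaluation code, it can be computed for one emotion dimension as follows. The function name `concordance_cc` and the sample values are hypothetical and chosen only for illustration.

```python
import numpy as np


def concordance_cc(y_true, y_pred):
    """Concordance correlation coefficient between labels and predictions.

    Standard definition: 2*cov / (var_true + var_pred + (mean_true - mean_pred)^2),
    using population (biased) variance and covariance.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2.0 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


# Hypothetical arousal labels and predictions for three utterances.
labels = [1.0, 2.0, 3.0]
shifted = [2.0, 3.0, 4.0]  # perfectly correlated but offset by a constant bias
print(concordance_cc(labels, labels))
print(concordance_cc(labels, shifted))
```

Unlike the Pearson correlation, CCC penalizes a constant offset or scale mismatch between predictions and labels, so the second call scores below 1 even though the two sequences are perfectly correlated.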


DOI: 10.21437/SpeechProsody.2020-6

Cite as: Atmaja, B.T., Akagi, M. (2020) The Effect of Silence Feature in Dimensional Speech Emotion Recognition. Proc. 10th International Conference on Speech Prosody 2020, 26-30, DOI: 10.21437/SpeechProsody.2020-6.


@inproceedings{Atmaja2020,
  author={Bagus Tris Atmaja and Masato Akagi},
  title={{The Effect of Silence Feature in Dimensional Speech Emotion Recognition}},
  year=2020,
  booktitle={Proc. 10th International Conference on Speech Prosody 2020},
  pages={26--30},
  doi={10.21437/SpeechProsody.2020-6},
  url={http://dx.doi.org/10.21437/SpeechProsody.2020-6}
}