Rushing to Judgement: How do Laypeople Rate Caller Engagement in Thin-Slice Videos of Human–Machine Dialog?

Vikram Ramanarayanan, Chee Wee Leong, David Suendermann-Oeft


We analyze the efficacy of a small crowd of naïve human raters in rating engagement during human–machine dialog interactions. Each rater viewed multiple 10-second, thin-slice videos of non-native English speakers interacting with a computer-assisted language learning (CALL) system and rated how engaged and disengaged those callers were while interacting with the automated agent. We examine how the crowd's ratings compare to callers' self-ratings of engagement, and further study how the distribution of these rating assignments varies as a function of whether the automated system or the caller was speaking. Finally, we discuss the potential applications and pitfalls of such a crowdsourced paradigm in designing, developing, and analyzing engagement-aware dialog systems.
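To make the comparison between crowd ratings and self-ratings concrete, one common approach (not necessarily the one used in the paper) is to collapse each clip's crowd ratings to a majority vote and compute a chance-corrected agreement statistic such as Cohen's kappa against the caller's self-rating. The sketch below illustrates this with entirely hypothetical data; the clip names, label values, and rating counts are invented for demonstration.

```python
from collections import Counter

# Hypothetical data: per-clip crowd ratings (1 = engaged, 0 = disengaged)
# and each caller's self-rating. All names and values are illustrative.
crowd_ratings = {
    "clip_01": [1, 1, 0, 1, 1],
    "clip_02": [0, 0, 1, 0, 0],
    "clip_03": [1, 0, 1, 1, 0],
}
self_ratings = {"clip_01": 1, "clip_02": 0, "clip_03": 0}

def majority_vote(ratings):
    """Collapse a clip's crowd ratings to the most frequent label."""
    return Counter(ratings).most_common(1)[0][0]

def cohens_kappa(a, b):
    """Chance-corrected agreement between two paired label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    labels = set(a) | set(b)
    # Expected agreement under independence of the two raters.
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)

clips = sorted(crowd_ratings)
crowd = [majority_vote(crowd_ratings[c]) for c in clips]
own = [self_ratings[c] for c in clips]
print(cohens_kappa(crowd, own))  # → 0.4 for this toy data
```

A per-clip agreement analysis like this could also be split by who was speaking (system vs. caller) in each thin slice, which is the kind of breakdown the paper's second research question examines.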


DOI: 10.21437/Interspeech.2017-1205

Cite as: Ramanarayanan, V., Leong, C.W., Suendermann-Oeft, D. (2017) Rushing to Judgement: How do Laypeople Rate Caller Engagement in Thin-Slice Videos of Human–Machine Dialog?. Proc. Interspeech 2017, 2526-2530, DOI: 10.21437/Interspeech.2017-1205.


@inproceedings{Ramanarayanan2017,
  author={Vikram Ramanarayanan and Chee Wee Leong and David Suendermann-Oeft},
  title={Rushing to Judgement: How do Laypeople Rate Caller Engagement in Thin-Slice Videos of Human–Machine Dialog?},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2526--2530},
  doi={10.21437/Interspeech.2017-1205},
  url={http://dx.doi.org/10.21437/Interspeech.2017-1205}
}