Automatic Scoring of Monologue Video Interviews Using Multimodal Cues

Lei Chen, Gary Feng, Michelle Martin-Raugh, Chee Wee Leong, Christopher Kitchen, Su-Youn Yoon, Blair Lehman, Harrison Kell, Chong Min Lee

Job interviews are an important tool for employee selection. When making hiring decisions, a variety of information from interviewees, such as previous work experience, skills, and verbal and nonverbal communication, is jointly considered. In recent years, Social Signal Processing (SSP), an emerging research area focused on enabling computers to sense and understand human social signals, has been used to develop systems for the coaching and evaluation of job interview performance. However, this research area is still in its infancy and lacks essential resources (e.g., adequate corpora). In this paper, we report on our efforts to create an automatic interview rating system for monologue-style video interviews, which are widely used in today’s job hiring market. We created the first multimodal corpus for such video interviews. Additionally, we collected manual ratings of interviewee personality and performance on 12 structured interview questions measuring different types of job-related skills. Finally, focusing on predicting overall interview performance, we explored a set of verbal and nonverbal features and several machine learning models. We found that using both verbal and nonverbal features provides more accurate predictions. Our initial results suggest that it is feasible to continue working in this newly formed area.

DOI: 10.21437/Interspeech.2016-1453

Cite as

Chen, L., Feng, G., Martin-Raugh, M., Leong, C.W., Kitchen, C., Yoon, S., Lehman, B., Kell, H., Lee, C.M. (2016) Automatic Scoring of Monologue Video Interviews Using Multimodal Cues. Proc. Interspeech 2016, 32-36.

@inproceedings{chen2016automatic,
  author={Lei Chen and Gary Feng and Michelle Martin-Raugh and Chee Wee Leong and Christopher Kitchen and Su-Youn Yoon and Blair Lehman and Harrison Kell and Chong Min Lee},
  title={Automatic Scoring of Monologue Video Interviews Using Multimodal Cues},
  booktitle={Interspeech 2016},
  year={2016},
  pages={32--36},
  doi={10.21437/Interspeech.2016-1453}
}