Automatic Scoring of Monologue Video Interviews Using Multimodal Cues

Job interviews are an important tool for employee selection. When making hiring decisions, a variety of information about interviewees, such as previous work experience, skills, and verbal and nonverbal communication, is jointly considered. In recent years, Social Signal Processing (SSP), an emerging research area concerned with enabling computers to sense and understand human social signals, has been used to develop systems for the coaching and evaluation of job interview performance. However, this research area is still in its infancy and lacks essential resources (e.g., adequate corpora). In this paper, we report on our efforts to create an automatic interview rating system for monologue-style video interviews, which are widely used in today’s job hiring market. We created the first multimodal corpus for such video interviews. Additionally, we manually rated each interviewee’s personality and performance across 12 structured interview questions measuring different types of job-related skills. Finally, focusing on predicting overall interview performance, we explored a set of verbal and nonverbal features and several machine learning models. We found that using both verbal and nonverbal features together provides more accurate predictions than using either alone. Our initial results suggest that it is feasible to continue working in this newly formed area.
