Automated scoring of interview videos using Doc2Vec multimodal feature extraction paradigm

As the popularity of video-based job interviews rises, so does the need for automated tools to evaluate interview performance. Real-world hiring decisions are based on assessments of knowledge and skills as well as holistic judgments of person-job fit. While previous research on automated scoring of interview videos shows promise, it does not cover monologue-style responses to structured interview (SI) questions or content-focused interview rating. We report the development of a standardized video interview protocol and of human rating rubrics focusing on verbal content, personality, and holistic judgment. We propose a novel feature extraction method that uses "visual words" automatically learned from video analysis outputs together with the Doc2Vec paradigm. Our experimental results suggest that this method provides effective representations for the automated scoring of interview videos.
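
As a rough illustration of the "visual words" idea, the sketch below quantizes frame-level feature vectors into discrete tokens, so that each video becomes a "document" of tokens that a Doc2Vec implementation could then embed. The abstract does not say how the visual words are learned; the use of k-means, the two-dimensional toy features, and all function and variable names here are assumptions made for illustration only, and the Doc2Vec training step itself is not shown.

```python
import math

def nearest(centroids, vec):
    """Return the index of the centroid closest to vec (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(centroids[i], vec))

def learn_codebook(frames, k, iters=20):
    """Plain k-means over frame-level feature vectors (an assumed choice)."""
    # Naive initialisation: use the first k frames as starting centroids.
    centroids = [list(f) for f in frames[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in frames:
            clusters[nearest(centroids, f)].append(f)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def to_visual_words(frames, centroids):
    """Map each frame to a token naming its nearest centroid."""
    return [f"vw{nearest(centroids, f)}" for f in frames]

# Hypothetical per-frame features (e.g. two numbers from a video analysis tool).
video_a = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0)]
video_b = [(1.0, 0.9), (0.0, 0.1)]

# The codebook is learned over frames pooled from all videos.
codebook = learn_codebook(video_a + video_b, k=2)

doc_a = to_visual_words(video_a, codebook)  # ['vw0', 'vw1', 'vw0', 'vw1']
doc_b = to_visual_words(video_b, codebook)  # ['vw1', 'vw0']
```

Each resulting token list plays the role of a tokenized text document, so it could be passed, with a per-video tag, to any Doc2Vec implementation to obtain one fixed-length vector per video for a downstream scoring model.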
