Initial investigation into computer scoring of candidate essays for personnel selection.

[Correction Notice: An Erratum for this article was reported in Vol 101(7) of the Journal of Applied Psychology (see record 2016-32115-001). In the article, the affiliations for Emily D. Campion and Matthew H. Reider were originally incorrect. All versions of this article have been corrected.]

Emerging advancements, including the exponentially growing availability of computer-collected data and increasingly sophisticated statistical software, have led to a "Big Data Movement" wherein organizations have begun attempting to use large-scale data analysis to improve their effectiveness. Yet little is known about how organizations can leverage these advancements to develop more effective personnel selection procedures, especially when the data are unstructured (text-based). Drawing on the literature on natural language processing, we critically examine the possibility of leveraging advances in text mining and predictive modeling software as a surrogate for human raters in a selection context. We explain how to "train" a computer program to emulate a human rater when scoring accomplishment records. We then examine the reliability of the computer's scores, provide preliminary evidence of their construct validity, demonstrate that this practice does not produce scores that disadvantage minority groups, illustrate the positive financial impact of adopting this practice in an organization (N ∼ 46,000 candidates), and discuss implementation issues. Finally, we discuss the potential implications of using computer scoring to address the adverse impact-validity dilemma. We suggest that it may provide a cost-effective means of using predictors that have comparable validity but have previously been too expensive for large-scale screening.
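The general approach the abstract describes, extracting text-based features from candidate essays and fitting a predictive model to reproduce human raters' scores, can be illustrated with a minimal sketch. The Python example below is an assumption-laden illustration using scikit-learn; the essays, scores, TF-IDF features, and ridge regression model are all placeholders and are not the authors' actual software, data, or scoring pipeline.

    # Minimal sketch (assumptions, not the authors' implementation): train a
    # simple text-regression model to emulate human ratings of
    # accomplishment-record essays.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import Ridge
    from sklearn.pipeline import make_pipeline

    # Hypothetical training data: essays already scored by trained human raters.
    train_essays = [
        "Led a five-person team that redesigned the claims intake process.",
        "Completed the required onboarding modules on schedule.",
        "Negotiated a vendor contract that cut annual costs by 12 percent.",
        "Attended weekly staff meetings and took notes.",
    ]
    train_scores = [4.5, 2.0, 4.0, 1.5]  # averaged human ratings for each essay

    # Bag-of-words (TF-IDF) features plus ridge regression stand in for whatever
    # text-mining and predictive-modeling software an organization might adopt.
    model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge(alpha=1.0))
    model.fit(train_essays, train_scores)

    # Score new, unseen candidate essays. In practice, one would also hold out
    # human-scored essays to estimate computer-human agreement (reliability)
    # before using the computer scores for screening decisions.
    new_essays = ["Managed a project that delivered the audit two weeks early."]
    print(model.predict(new_essays))

In an applied setting, the key evaluation steps the abstract lists (reliability against human raters, construct validity evidence, and subgroup-difference checks) would be carried out on held-out, human-scored essays rather than the training set.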
