Detecting and explaining person misfit in non-cognitive measurement

The logistic person response function (PRF) models the probability of a correct response as a function of item location. Reise (2000) proposed using the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters within this multilevel framework. An advantage of the multilevel framework is that it allows person fit to be related to explanatory variables. We critically discuss Reise’s (2000) approach. First, we argue that interpreting the PRF slope as an indicator of person misfit is often incorrect. Second, we show that the multilevel logistic regression model and the logistic PRF model are incompatible, resulting in a multilevel person-fit framework that grossly violates the bivariate normality assumption for residuals in the multilevel model. Third, we use a Monte Carlo study to show that, in the multilevel logistic regression framework, estimates of the distribution parameters of PRF intercepts and slopes are biased. Finally, we discuss the implications of these results and suggest an alternative multilevel regression approach to explanatory person-fit analysis. We illustrate the alternative approach using empirical data on repeated anxiety measurements of cardiac arrhythmia patients who had a cardioverter-defibrillator implanted.

This chapter was published as: Conijn, J. M., Emons, W. H. M., Van Assen, M. A. L. M., & Sijtsma, K. (2011). On the usefulness of a multilevel logistic regression approach to person-fit analysis. Multivariate Behavioral Research, 46, 365-388.
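To make the PRF slope concrete, the following is a minimal sketch of a logistic PRF for a single person, assuming the common form logit P = intercept + slope × item location. The function name, the parameter values, and the item locations are hypothetical illustrations, not values or code from the chapter. The sketch shows only how a flatter slope makes response probabilities less sensitive to item location; it does not take a position on whether that flatness should be read as misfit, which is the interpretation the chapter questions.

```python
import math

def prf_probability(item_location, intercept, slope):
    """Logistic PRF: probability of a correct response at a given item location."""
    logit = intercept + slope * item_location
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical item locations, from easy (negative) to difficult (positive).
item_locations = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Two hypothetical persons with the same intercept but different PRF slopes.
steep = [prf_probability(d, intercept=0.0, slope=-1.5) for d in item_locations]
flat = [prf_probability(d, intercept=0.0, slope=-0.2) for d in item_locations]

for d, p_steep, p_flat in zip(item_locations, steep, flat):
    print(f"location {d:+.1f}: steep PRF {p_steep:.3f}, flat PRF {p_flat:.3f}")
```

With a negative slope the probability decreases as items become more difficult; the flatter PRF changes much less across the same range of item locations.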

[1]  Edna B. Foa,et al.  The Validation of a Self-Report Measure of Posttraumatic Stress Disorder , 1997 .

[2]  David J. Whitney,et al.  Appropriateness Fit and Criterion-Related Validity , 1993 .

[3]  T. Pinsoneault A Variable Response Inconsistency Scale and a True Response Inconsistency Scale for the Jesness Inventory. , 1998 .

[4]  J K Wing,et al.  Health of the Nation Outcome Scales (HoNOS) , 1998, British Journal of Psychiatry.

[5]  Tom A. B. Snijders,et al.  Asymptotic null distribution of person fit statistics with estimated person parameter , 2001 .

[6]  Carol M Woods,et al.  Detection of Aberrant Responding on a Personality Scale in a Military Sample: an Application of Evaluating Person Fit with Two-level Logistic Regression , 2022 .

[7]  G. Andersson,et al.  Self-reported versus clinician-rated symptoms of depression as outcome measures in psychotherapy research on depression: a meta-analysis. , 2010, Clinical psychology review.

[8]  T. Tracey,et al.  The bilevel structure of the Outcome Questionnaire-45. , 2010, Psychological assessment.

[9]  Kit-Tai Hau,et al.  Goodness of fit in structural equation models , 2005 .

[10]  James N. Butcher,et al.  Assessment of Anger: The State-Trait Anger Scale , 2013 .

[11]  E. Hawkins,et al.  Measuring Outcome in Professional Practice: Considerations in Selecting and Using Brief Outcome Instruments. , 2004 .

[12]  Marla J. De Jong,et al.  Measurement of Anxiety for Patients With Cardiac Disease: A Critical Review and Analysis , 2006, The Journal of cardiovascular nursing.

[13]  Johan Denollet,et al.  DS14: Standard Assessment of Negative Affectivity, Social Inhibition, and Type D Personality , 2005, Psychosomatic medicine.

[14]  Marie Davidian,et al.  A Monte Carlo EM algorithm for generalized linear mixed models with flexible random effects distribution. , 2002, Biostatistics.

[15]  Lawrence M. Rudner,et al.  The Use of a Person-Fit Statistic with One High-Quality Achievement Test. , 1996 .

[16]  S. Reise,et al.  Using Multilevel Logistic Regression to Evaluate Person-Fit in IRT Models , 2000, Multivariate behavioral research.

[17]  Kikumi Tatsuoka Use of Generalized Person-Fit Indexes, Zetas for Statistical Pattern Classification. , 1996 .

[18]  R. Abelson Statistics As Principled Argument , 1995 .

[19]  F. Holloway Outcome measurement in mental health – welcome to the revolution , 2002, British Journal of Psychiatry.

[20]  Robert L. Linn,et al.  A Generalized Logistic Item Response Model Parameterizing Test Score Inappropriateness , 1987 .

[21]  Pere J. Ferrando,et al.  A Pearson-Type-VII item response model for assessing person fluctuation , 2007 .

[22]  Gordon W. Cheung,et al.  Assessing Extreme and Acquiescence Response Sets in Cross-Cultural Research Using Structural Equations Modeling , 2000 .

[23]  Carol M Woods,et al.  Monte Carlo Evaluation of Two-Level Logistic Regression for Assessing Person Fit , 2008, Multivariate behavioral research.

[24]  Pere J. Ferrando,et al.  Person Reliability in Personality Measurement: An Item Response Theory Analysis , 2004 .

[25]  George Engelhard,et al.  Using Item Response Theory and Model—Data Fit to Conceptualize Differential Item and Person Functioning for Students With Disabilities , 2009 .

[26]  Klaas Sijtsma,et al.  Methodology Review: Evaluating Person Fit , 2001 .

[27]  R. J. Mokken,et al.  Handbook of modern item response theory , 1997 .

[28]  Auke Tellegen,et al.  The Analysis of Consistency in Personality Assessment , 1988 .

[29]  Rob R. Meijer,et al.  A Comparison of the Person Response Function and the lz Person-Fit Statistic , 1998 .

[30]  John A. Johnson,et al.  The international personality item pool and the future of public-domain personality measures , 2006 .

[31]  R. Meijer,et al.  An evaluation of the Brief Symptom Inventory-18 using item response theory: which items are most strongly related to psychological distress? , 2011, Psychological assessment.

[32]  D. Clark,et al.  Reliability and Validity of the State-Trait Anxiety Inventory for Children in Adolescent Substance Abusers: Confirmatory Factor Analysis and Item Response Theory , 1997 .

[33]  Klaas Sijtsma,et al.  Testing Hypotheses About the Person-Response Function in Person-Fit Analysis , 2004, Multivariate behavioral research.

[34]  Fritz Drasgow,et al.  Fitting Polytomous Item Response Theory Models to Multiple-Choice Tests , 1995 .

[35]  Robert E. Ployhart,et al.  Assessing the Convergent and Discriminant Validity of Goldberg’s International Personality Item Pool , 2006 .

[36]  R. MacCallum,et al.  Power analysis and determination of sample size for covariance structure modeling. , 1996 .

[37]  N. Christiansen,et al.  CORRECTING THE 16PF FOR FAKING: EFFECTS ON CRITERION-RELATED VALIDITY AND INDIVIDUAL HIRING DECISIONS , 1994 .

[38]  L. Jordaens,et al.  Increased Anxiety in Partners of Patients with a Cardioverter‐Defibrillator: The Role of Indication for ICD Therapy, Shocks, and Personality , 2009, Pacing and clinical electrophysiology : PACE.

[39]  I. Deary,et al.  Goldberg’s ‘IPIP’ Big-Five factor markers: Internal consistency and concurrent validation in Scotland , 2005 .

[40]  M. Slade What Outcomes to Measure in Routine Mental Health Services, and How to Assess Them: A Systematic Review , 2002, The Australian and New Zealand journal of psychiatry.

[41]  C. Glas,et al.  The Effect of Person Misfit on Classification Decisions , 2005 .

[42]  M. Salzer,et al.  Introduction Measuring Quality in Mental Health Services , 1997 .

[43]  C. Parsons,et al.  Application of Unidimensional Item Response Theory Models to Multidimensional Data , 1983 .

[44]  Rob R. Meijer,et al.  The Null Distribution of Person-Fit Statistics for Conventional and Adaptive Tests , 1999 .

[45]  W. Emons Nonparametric Person-Fit Analysis of Polytomous Item Scores , 2008 .

[46]  Correcting for Person Misfit in Aggregated Score Reporting , 2007 .

[47]  G. Proctor,et al.  Clinical assessment , 2014, BDJ.

[48]  Fritz Drasgow,et al.  Detecting Faking on a Personality Instrument Using Appropriateness Measurement , 1996 .

[49]  N. Schmitt,et al.  Correlates of Person Fit and Effect of Person Fit on Test Validity , 1999 .

[50]  M. Seidenberg,et al.  Cognitive deficits, psychopathology, and psychosocial functioning in bipolar mood disorder. , 1998, Neuropsychiatry, neuropsychology, and behavioral neurology.

[51]  K. Sullivan,et al.  Detecting faked psychopathology: A comparison of two tests to detect malingered psychopathology using a simulation design , 2010, Psychiatry Research.

[52]  M. Lambert,et al.  Collecting client feedback. , 2011, Psychotherapy.

[53]  S. Srivastava,et al.  The Big Five Trait taxonomy: History, measurement, and theoretical perspectives. , 1999 .

[54]  A. Wolf,et al.  Questioning the measurement precision of psychotherapy research , 2009, Psychotherapy research : journal of the Society for Psychotherapy Research.

[55]  S. Raudenbush,et al.  Maximum Likelihood for Generalized Linear Models with Nested Random Effects via High-Order, Multivariate Laplace Approximation , 2000 .

[56]  Elizabeth Broadbent,et al.  Explaining medically unexplained symptoms-models and mechanisms. , 2007, Clinical psychology review.

[57]  Sarity Dodson,et al.  Health and Quality of Life Outcomes , 2005 .

[58]  M. Barkham,et al.  Towards a standardised brief outcome measure: Psychometric properties and utility of the CORE–OM , 2002, British Journal of Psychiatry.

[59]  Anthony S. Bryk,et al.  Hierarchical Linear Models: Applications and Data Analysis Methods , 1992 .

[60]  E. Chico,et al.  Detecting Dissimulation in Personality Test Scores: A Comparison between Person-Fit Indices and Detection Scales , 2001 .

[61]  Steven P. Reise,et al.  Assessing Person-Fit on Measures of Typical Performance , 1996 .

[62]  J. Ormel,et al.  A validation study of the Hospital Anxiety and Depression Scale (HADS) in different groups of Dutch subjects , 1997, Psychological Medicine.

[63]  W. Heiser,et al.  The Outcome Questionnaire (OQ‐45) in a Dutch population: A cross‐cultural validation , 2007 .

[64]  Fritz Drasgow,et al.  Appropriateness Measurement for Some Multidimensional Test Batteries , 1991 .

[65]  Klaas Sijtsma,et al.  Detection and Validation of Unscalable Item Score Patterns Using Item Response Theory: An Illustration with Harter's Self-Perception Profile for Children , 2008, Journal of personality assessment.

[66]  S. Pedersen,et al.  Psychological Intervention Following Implantation of an Implantable Defibrillator: A Review and Future Recommendations , 2007, Pacing and clinical electrophysiology : PACE.

[67]  Michael L. Nering The Distribution of Person Fit Using True and Estimated Person Parameters , 1995 .

[68]  F. Zitman,et al.  Routine outcome monitoring in the Netherlands: practical experiences with a web-based strategy for the assessment of treatment outcome in clinical practice. , 2011, Clinical psychology & psychotherapy.

[69]  R Core Team,et al.  R: A language and environment for statistical computing. , 2014 .

[70]  S. Phillips THE EFFECTS OF THE DELETION OF MISFITTING PERSONS ON VERTICAL EQUATING VIA THE RASCH MODEL , 1986 .

[71]  Jon A. Krosnick,et al.  Satisficing in surveys: Initial evidence , 1996 .

[72]  Peter Borkenau,et al.  Comparing exploratory and confirmatory factor analysis: A study on the 5-factor model of personality , 1990 .

[73]  D. M. Pedersen Acquiescence and Social Desirability Response Sets and Some Personality Correlates , 1967 .

[74]  P. J. Ferrando A graded response model for measuring person reliability. , 2009, The British journal of mathematical and statistical psychology.

[75]  S. Reise,et al.  How many IRT parameters does it take to model psychopathology items? , 2003, Psychological methods.

[76]  M. Thomas The Value of Item Response Theory in Clinical Assessment: A Review , 2011, Assessment.

[77]  M. Reckase Multidimensional Item Response Theory , 2009 .

[78]  George Karabatsos,et al.  Comparing the Aberrant Response Detection Performance of Thirty-Six Person-Fit Statistics , 2003 .

[79]  G. Burlingame,et al.  Construct Validity of the Outcome Questionnaire: A Confirmatory Factor Analysis , 1998 .

[80]  P. Ferrando,et al.  Some Statistics for Assessing Person-Fit Based on Continuous-Response Models , 2010 .

[81]  Jürgen Rost,et al.  Rasch Models in Latent Classes: An Integration of Two Approaches to Item Analysis , 1990 .

[82]  Carol M. Woods Careless Responding to Reverse-Worded Items: Implications for Confirmatory Factor Analysis , 2006 .

[83]  F. Samejima Graded Response Model , 1997 .

[84]  Jimmy de la Torre,et al.  Improving Person-Fit Assessment by Correcting the Ability Estimate and Its Reference Distribution. , 2008 .

[85]  S. Hathaway,et al.  MMPI-2 : Minnesota Multiphasic Personality Inventory-2 : manual for administration and scoring , 1989 .

[86]  T. Pinsoneault A Variable Response Inconsistency scale and a True Response Inconsistency scale for the Millon Adolescent Clinical Inventory. , 2002, Psychological assessment.

[87]  K. Schaie The course of adult intellectual development. , 1994, The American psychologist.

[88]  J. Brekke,et al.  Testing the Cross-Ethnic Construct Validity of the Brief Symptom Inventory , 2009 .

[89]  Harvey J Cohen,et al.  An Overview of Variance Inflation Factors for Sample-Size Calculation , 2003, Evaluation & the health professions.

[90]  Alberto Maydeu-Olivares,et al.  Estimation of IRT graded response models: limited versus full information methods. , 2009, Psychological methods.

[91]  David M. Lahuis,et al.  Investigating Faking Using a Multilevel Logistic Regression Approach to Measuring Person Fit , 2009 .

[92]  C. F. Kao,et al.  The efficient assessment of need for cognition. , 1984, Journal of personality assessment.

[93]  James Lumsden Tests are perfectly reliable , 1978 .

[94]  R. McCrae,et al.  On the invalidity of validity scales: evidence from self-reports and observer ratings in volunteer samples. , 2000, Journal of personality and social psychology.

[95]  P. Fayers Item Response Theory for Psychologists , 2004, Quality of Life Research.

[96]  D. Watson,et al.  Development and validation of brief measures of positive and negative affect: the PANAS scales. , 1988, Journal of personality and social psychology.

[97]  Jenn-Yun Tein,et al.  Longitudinal measurement models in evaluation research: Examining stability and change , 1996 .

[98]  P. Bentler,et al.  Cutoff criteria for fit indexes in covariance structure analysis : Conventional criteria versus new alternatives , 1999 .

[99]  Steven P Reise,et al.  Item response theory and clinical measurement. , 2009, Annual review of clinical psychology.

[100]  S. N. Beretvas,et al.  A Validation of the Factor Structure of OQ-45 Scores Using Factor Mixture Modeling , 2010 .

[101]  Kenneth A. Bollen,et al.  Overall Fit in Covariance Structure Models: Two Types of Sample Size Effects , 1990 .

[102]  Rob R. Meijer,et al.  Trait Level Estimation for Nonfitting Response Vectors , 1997 .

[103]  Stephen Olejnik,et al.  The Power of Rasch Person-Fit Statistics in Detecting Unusual Response Patterns , 1997 .

[104]  K. Sijtsma,et al.  Explanatory, Multilevel Person-Fit Analysis of Response Consistency on the Spielberger State-Trait Anxiety Inventory , 2013, Multivariate behavioral research.

[105]  Klaas Sijtsma,et al.  On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis , 2011, Multivariate behavioral research.

[106]  R. Petty,et al.  The need to evaluate. , 1996 .

[107]  Assessing inconsistent responding in E and N measures: An application of person-fit analysis in personality , 2012 .

[108]  L. Clark Schedule for Nonadaptive and Adaptive Personality (SNAP). , 1993 .

[109]  De Outcome Questionnaire: psychometrische kenmerken van de Nederlandse vertaling , 2004 .

[110]  Klaas Sijtsma,et al.  Global, local, and graphical person-fit analysis using person-response functions. , 2005, Psychological methods.

[111]  Steven P. Reise,et al.  Traitedness and the assessment of response pattern scalability , 1993 .

[112]  G. Huston The Hospital Anxiety and Depression Scale. , 1987, The Journal of rheumatology.

[113]  Ya-Fen Chan,et al.  Screening for atypical suicide risk with person fit statistics among people presenting to alcohol and other drug treatment. , 2010, Drug and alcohol dependence.

[114]  D. Betsy McCoach,et al.  The Performance of RMSEA in Models With Small Degrees of Freedom , 2015 .

[115]  Y. Benjamini,et al.  Controlling the false discovery rate: a practical and powerful approach to multiple testing , 1995 .

[116]  Steven P. Reise,et al.  Scoring Method and the Detection of Person Misfit in a Personality Assessment Context , 1995 .

[117]  Herbert Hoijtink,et al.  The many null distributions of person fit indices , 1990 .

[118]  Fritz Drasgow,et al.  Detecting Inappropriate Test Scores with Optimal and Practical Appropriateness Indices , 1987 .

[119]  Y. Benjamini,et al.  THE CONTROL OF THE FALSE DISCOVERY RATE IN MULTIPLE TESTING UNDER DEPENDENCY , 2001 .

[120]  J. Lumsden,et al.  Person Reliability , 1977 .

[121]  R. Meijer,et al.  Analyzing psychopathology items: a case for nonparametric item response theory modeling. , 2004, Psychological methods.

[122]  R. Moineddin,et al.  A simulation study of sample size for multilevel logistic regression models , 2007, BMC medical research methodology.

[123]  C. Spielberger,et al.  Manual for the state-trait anxiety inventory (form Y) : "self-evaluation questionnaire" , 1983 .

[124]  Fritz Drasgow,et al.  Appropriateness measurement with polychotomous item response models and standardized indices , 1984 .

[125]  M. Trivedi,et al.  Systematic use of patient-rated depression severity monitoring: is it helpful and feasible in clinical psychiatry? , 2008, Psychiatric services.

[126]  P. Heagerty,et al.  Misspecified maximum likelihood estimates and generalised linear mixed models , 2001 .

[127]  T. Keith Multiple Regression and Beyond , 2005, Principles & Methods of Statistical Analysis.

[128]  A. Tellegen,et al.  Psychometric functioning of the MMPI-2-RF VRIN-r and TRIN-r scales with varying degrees of randomness, acquiescence, and counter-acquiescence. , 2010, Psychological assessment.

[129]  Dimitris Rizopoulos,et al.  ltm: An R Package for Latent Variable Modeling and Item Response Analysis , 2006 .

[130]  Klaas Sijtsma,et al.  The person response function as a tool in person-fit research , 2001 .

[131]  Steven P. Reise,et al.  The Influence of Test Characteristics on the Detection of Aberrant Response Patterns , 1991 .

[132]  R. Meijer,et al.  Diagnosing item score patterns on a test using item response theory-based person-fit statistics. , 2003, Psychological methods.

[133]  L. Jordaens,et al.  Risk of chronic anxiety in implantable defibrillator patients: a multi-center study. , 2011, International journal of cardiology.

[134]  Michael V. LeVine,et al.  Appropriateness measurement: Review, critique and validating studies , 1982 .

[135]  Roel Bosker,et al.  Multilevel analysis : an introduction to basic and advanced multilevel modeling , 1999 .