3 Validity: Foundational Issues and Statistical Methodology

Publisher Summary: This chapter highlights foundational and statistical issues in validity theory and validation practice. It opens with several observations about the current state of validity theory and practice, introduces a new framework for considering the bounds and limitations of measurement inferences, and discusses the distinction between measures and indices. It then deals with two statistical methods, variable ordering and latent variable regression, and introduces a methodology for variable ordering in latent variable regression models in validity research. Measurement or test score validation is an ongoing process in which evidence is provided to support the appropriateness, meaningfulness, and usefulness of the specific inferences made from scores about individuals from a given sample and in a given context. The concept, method, and processes of validation are central to constructing and evaluating measures used in the social, behavioral, health, and human sciences, because without validation any inferences made from a measure are potentially meaningless, inappropriate, and of limited usefulness.
