Assessing the discriminating power of item and test scores in the linear factor-analysis model

Model-based attempts to rigorously study the broad and imprecise concept of 'discriminating power' are scarce, and generally limited to nonlinear models for binary responses. This paper proposes a comprehensive framework for assessing the discriminating power of item and test scores that are analyzed or obtained using Spearman's factor-analytic model. The proposed framework is organized on the basis of three criteria: (a) type of score, (b) range of discrimination, and (c) conceptualization and aspect that are measured. Within this framework, the functioning and interpretation of 16 measures, of which 6 appear to be new, are discussed, and the relations between them are established. The usefulness of the proposal in psychometric factor-analysis (FA) applications is illustrated by means of an empirical example.

As several authors have pointed out (Loevinger, 1954; Lord & Novick, 1968; McDonald, 1999), the term "discriminating power" is rather imprecise. In a broad sense, it refers to the degree to which a score varies with trait level, as well as the effectiveness of this score in distinguishing between respondents with a high trait level and respondents with a low trait level. This property is directly related to the quality of the score as a measure of the trait (Lord & Novick, 1968; McDonald, 1999), so it is of central practical importance, particularly in the context of item selection. For this reason, most research has focused on developing indices that are thought to express this property numerically, whereas more theoretically oriented research is far scarcer. Below we provide a review of the literature that is most related to the present developments. The review is organized

[1] M. R. Novick, et al. Statistical Theories of Mental Test Scores, 1971.

[2] Goldine C. Gleser, et al. Maximizing the discriminating power of a multiple-score test, 1953.

[3] Mark D. Reckase, et al. A Linear Logistic Multidimensional Model for Dichotomous Item Response Data, 1997.

[4] R. P. McDonald, et al. Test Theory: A Unified Treatment, 1999.

[5] P. Ferrando, et al. Theoretical and Empirical Comparisons between Two Models for Continuous Item Response, 2002, Multivariate Behavioral Research.

[6] D. Lawley, et al. XXIII.—On Problems connected with Item Selection and Test Construction, 1943, Proceedings of the Royal Society of Edinburgh. Section A. Mathematical and Physical Sciences.

[7] G. J. Mellenbergh, et al. A Unidimensional Latent Trait Model for Continuous Item Responses, 1994, Multivariate Behavioral Research.

[8] K. Jöreskog. Statistical analysis of sets of congeneric tests, 1971.

[9] R. P. McDonald, et al. A Basis for Multidimensional Item Response Theory, 2000.

[10] W. Thurlow. Direct Measures of Discriminations Among Individuals Performed by Psychological Tests, 1950.

[11] G. A. Ferguson, et al. On the theory of test discrimination, 1949, Psychometrika.

[12] Akihito Kamata, et al. A Note on the Relation Between Factor Analytic and Item Response Theory Models, 2008.

[13] T. Berge, et al. How to score questionnaires, 1998.

[14] M. Hankins, et al. Questionnaire discrimination: (re)-introducing coefficient δ, 2007, BMC Medical Research Methodology.

[15] J. Milholland. The Reliability of Test Discriminations, 1955.

[16] J. Loevinger, et al. The attenuation paradox in test theory, 1954, Psychological Bulletin.

[17] Lee J. Cronbach, et al. The Signal/Noise Ratio in the Comparison of Reliability Coefficients, 1964.

[18] An Index of the Discriminating Power of a Test at Different Parts of the Score Range, 1959.

[19] R. Dawes. Fundamentals of attitude measurement, 1972.

[20] W. Meredith, et al. Factorial Invariance: Historical Perspectives and New Problems, 2007.

[21] W. A. Nicewander. Some relationships between the information function of IRT and the signal/noise ratio and reliability coefficient of classical test theory, 1993.

[22] Sten Henrysson. The relation between factor loadings and biserial correlations in item analysis, 1962.

[23] R. Jackson. Reliability of Mental Tests, 1939.

[24] J. Greenberg, et al. An Item Response Theory for Personality and Attitude Scales: Item Analysis Using Restricted Factor Analysis, 1983.

[25] Pere J. Ferrando, et al. Difficulty, Discrimination, and Information Indices in the Linear Factor Analysis Model for Continuous Item Responses, 2009.

[26] F. Lord. Applications of Item Response Theory To Practical Testing Problems, 1980.

[27] C. Spearman. General intelligence Objectively Determined and Measured, 1904.

[28] Gideon J. Mellenbergh, et al. Measurement precision in test score and item response models, 1996.

[29] F. Lord. A theory of test scores, 1952.

[30] Cody Ding, et al. Assessing Content Validity and Content Equivalence Using Structural Equation Modeling, 2002.