Differential Weighting: A Review of Methods and Empirical Studies

When measures are to be combined to form a composite measure or to predict a criterion, the question of differential weighting of the component measures arises. Can differential weighting improve the reliability of the composite or provide a more valid composite than would be obtained if the component measures were merely summed or averaged? Theoretically the answer to this question should be "yes." It is unlikely that the component measures will be equally reliable, have equal variances, be equally intercorrelated with one another, and be equally correlated with the underlying variable which the composite is to measure or with the external criterion which is to be predicted. Since each of these characteristics of the component measures will be reflected in the composite measure, it is to be expected, on purely logical grounds, that differential weighting would be effective.

When a criterion measure is available, multiple-regression techniques provide a set of weights optimal for minimizing the error of prediction for the group on which the weights were derived, under the usual assumptions of normality and linearity of regression. Alternatively, the weights may be chosen so as to maximize certain internal criteria, such as the reliability of the composite measure. All methods weight most heavily those component measures which are "best" according to the particular criterion adopted, and they weight least, perhaps negatively, those which are "worst."

McDonald (1968) offered a "unified treatment of the weighting problem," a general procedure for obtaining weighted linear combinations of variables. The procedure includes the following as special cases: multiple regression, canonical variate analysis, principal components, maximum reliability, canonical factor analysis, and some other well-known methods. McDonald's procedure yields certain desirable invariance properties across transformations of the variables. Although the approach is not discussed
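The statement that regression weights are optimal only for the group on which they were derived can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are simulated, the three component measures and their loadings are hypothetical, and the helper names (`regression_weights`, `composite`, `simulate`) are not taken from any of the reviewed studies. The sketch derives least-squares weights on a small derivation sample and compares the resulting composite with a simple unit-weighted sum, both in the derivation sample and in a fresh sample.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def solve(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def regression_weights(rows, y):
    """Least-squares slope weights from the normal equations on centered data."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    my = sum(y) / n
    xc = [[r[j] - means[j] for j in range(k)] for r in rows]
    yc = [v - my for v in y]
    xtx = [[sum(r[i] * r[j] for r in xc) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(xc, yc)) for i in range(k)]
    return solve(xtx, xty)


def composite(rows, weights):
    """Weighted linear composite of the component measures for each person."""
    return [sum(w * v for w, v in zip(weights, r)) for r in rows]


def simulate(n, rng):
    """Hypothetical data: three correlated components and a criterion."""
    rows, y = [], []
    for _ in range(n):
        g = rng.gauss(0, 1)  # shared factor (an assumption of this sketch)
        rows.append([0.7 * g + rng.gauss(0, 1),
                     0.6 * g + rng.gauss(0, 1),
                     0.5 * g + rng.gauss(0, 1)])
        y.append(g + rng.gauss(0, 1))
    return rows, y


rng = random.Random(0)
train_x, train_y = simulate(60, rng)    # derivation sample
test_x, test_y = simulate(2000, rng)    # fresh sample

w_reg = regression_weights(train_x, train_y)
w_unit = [1.0, 1.0, 1.0]

r_reg_train = pearson(composite(train_x, w_reg), train_y)
r_unit_train = pearson(composite(train_x, w_unit), train_y)
r_reg_test = pearson(composite(test_x, w_reg), test_y)
r_unit_test = pearson(composite(test_x, w_unit), test_y)

print("derivation sample: regression", round(r_reg_train, 3),
      "unit", round(r_unit_train, 3))
print("fresh sample:      regression", round(r_reg_test, 3),
      "unit", round(r_unit_test, 3))
```

In the derivation sample the regression composite cannot correlate less with the criterion than the unit-weighted sum, because least squares maximizes that correlation over all linear composites. No such guarantee holds in the fresh sample, and the size of any difference there depends entirely on the simulated conditions chosen above.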

[1]  Robert L. McCornack,et al.  A criticism of studies comparing item-weighting methods. , 1956 .

[2]  H. R. Douglass,et al.  Is it Necessary to Weight Exercises in Standard Tests , 1923 .

[3]  E. H. Staffelbach Weighting responses in true-false examinations. , 1930 .

[4]  R. P. McDonald,et al.  A unified treatment of the weighting problem , 1968, Psychometrika.

[5]  J. Stalnaker Weighting questions in the essay-type examination. , 1938 .

[6]  A. Yee,et al.  A NEW LOGICAL SCORING KEY FOR THE MINNESOTA TEACHER ATTITUDE INVENTORY , 1969 .

[7]  C. S. Bernhardson Determination of the Chance Score on the Three-Decision Multiple-Choice Test , 1966, Psychological reports.

[8]  D. G. Ryans An Analysis and Comparison of Certain Techniques for Weighting Criterion Data , 1954 .

[9]  Kate Hevner A Method of Correcting for Guessing in True-False Tests and Empirical Evidence in Support of It , 1932 .

[10]  Harold A. Edgerton,et al.  The method of minimum variation for the combination of criteria , 1936 .

[11]  Melvin R. Novick,et al.  Some latent trait models and their use in inferring an examinee's ability , 1966 .

[12]  N. Uhl,et al.  Predicting Shrinkage in the Multiple Correlation Coefficient , 1970 .

[13]  Jack C. Merwin Rational and mathematical relationships of six scoring procedures applicable to three-choice items. , 1959 .

[14]  R. Wherry Maximal weighting of qualitative data , 1944 .

[15]  Ronald K. Hambleton,et al.  A COMPARISON OF THE RELIABILITY AND VALIDITY OF TWO METHODS FOR ASSESSING PARTIAL KNOWLEDGE ON A MULTIPLE-CHOICE TEST , 1970 .

[16]  Frederick B. Davis,et al.  The Effect on Test Reliability and Validity of Scoring Aptitude and Achievement Tests With Weights for Every Choice , 1959 .

[17]  E. Cureton II. Approximate Linear Restraints and Best Predictor Weights , 1951 .

[18]  Paul Horst,et al.  Obtaining a composite measure from a number of different measures of the same attribute , 1936 .

[19]  C. S. Bernhardson Comparison of the Three-Decision and Conventional Multiple-Choice Tests , 1967, Psychological reports.

[20]  L. Thurstone,et al.  A scoring method for mental tests. , 1919 .

[21]  J. Guilford A simple scoring weight for test items and its reliability , 1941 .

[22]  J. G. Peatman The influence of weighted true-false test scores on grades. , 1930 .

[23]  Paul Horst,et al.  The prediction of personal adjustment. , 1942 .

[24]  E. Peel PREDICTION OF A COMPLEX CRITERION AND BATTERY RELIABILITY , 1948 .

[25]  H. Soderquist A New Method of Weighting Scores in a True-False Test , 1936 .

[26]  C. I. Mosier,et al.  On the reliability of a weighted composite , 1943 .

[27]  Wendell William Wright The development and use of a composite achievement test , 1929 .

[28]  S. Shiba A METHOD FOR SCORING MULTICATEGORY ITEMS , 1965 .

[29]  P. Herzberg The Parameters of Cross-Validation , 1967 .

[30]  Jack W. Dunlap,et al.  Derivation and application of a unit scoring system for the strong vocational interest blank for women , 1942 .

[31]  R. Yerkes A Point Scale for Measuring Mental Ability. , 1917, Proceedings of the National Academy of Sciences of the United States of America.

[32]  E. A. Peel A Short Method for Calculating Maximum Battery Reliability , 1947, Nature.

[33]  Deriving a Composite Score From Several Measures of the Same Attribute , 1957 .

[34]  L. Wolins The Use of Multiple Regression Procedures When the Predictor Variables are Psychological Tests , 1967 .

[35]  John Schmid,et al.  Some Modifications of the Multiple-Choice Item , 1953 .

[36]  G. W. Snedecor Statistical Methods , 1964 .

[37]  C. F. Willey The Three-Decision Multiple-Choice Test: A Method of Increasing the Sensitivity of the Multiple-Choice Item , 1960 .

[38]  T. Cleary An individual differences model for multiple regression , 1966 .

[39]  Stephen M. Corey,et al.  Vocational interests of men and women. , 1944 .

[40]  P M Bentler,et al.  Alpha-maximized factor analysis (alphamax): Its relation to alpha and canonical factor analysis , 1968, Psychometrika.

[41]  C. W. Odell Further data concerning the effect of weighting exercises in new-type examinations. , 1931 .

[42]  Frederick B. Davis,et al.  Estimation and Use of Scoring Weights for Each Choice in Multiple-Choice Test Items , 1959 .

[43]  Paul I. Jacobs,et al.  Information in Wrong Responses , 1970 .

[44]  B. Green Best linear composites with a specified structure , 1969 .

[45]  Howard L. Jones,et al.  A New Evaluation Instrument , 1949 .

[46]  L. Nedelsky Ability to Avoid Gross Error as a Measure of Achievement , 1954 .

[47]  B. deFinetti,et al.  METHODS FOR DISCRIMINATING LEVELS OF PARTIAL KNOWLEDGE CONCERNING A TEST ITEM. , 1965, The British journal of mathematical and statistical psychology.

[48]  K. Holzinger An analysis of the errors in mental measurement. , 2022 .

[49]  L. Aiken Weighting and Guessing on Varieties of the Multiple-Choice Item , 1968 .

[50]  L. K. Waters The Utility of Importance Weights in Predicting Overall Job Satisfaction and Dissatisfaction , 1969 .

[51]  R. Pintner,et al.  A Standardization and Weighting of Two Hundred Analogies. , 1920 .

[52]  M. Slakter THE PENALTY FOR NOT GUESSING , 1968 .

[53]  T. L. Kelley The scoring of alternative responses with reference to some criterion. , 1934 .

[54]  J. Stanley,et al.  Restrictions on the Possible Values of r12, Given r13 and r23 , 1969 .

[55]  S. S. Wilks Weighting systems for linear functions of correlated variables when there is no dependent variable , 1938 .

[56]  Frederic M. Lord,et al.  An Analysis of the Verbal Scholastic Aptitude Test Using Birnbaum's Three-Parameter Logistic Model , 1968 .

[57]  Nora A. Congdon New weights for the responses in the Heilman Personal Data Scale. , 1941 .

[58]  Joan J. Michael THE RELIABILITY OF A MULTIPLE-CHOICE EXAMINATION UNDER VARIOUS TEST-TAKING INSTRUCTIONS , 1968 .

[59]  E. L. Clark A method of evaluating the units of a test. , 2022 .

[60]  Henry F. Kaiser,et al.  Alpha factor analysis , 1965, Psychometrika.

[61]  R. L. Winkler The Quantification of Judgment: Some Methodological Suggestions , 1967 .

[62]  James O. Ramsay A SCORING SYSTEM FOR MULTIPLE CHOICE TEST ITEMS , 1968 .

[63]  H. Hotelling The most predictable criterion. , 1935 .

[64]  A. Porter A Chi-Square Approach to Discrimination Among Occupations, Using an Interest Inventory. , 1967 .

[65]  G. Thomson WEIGHTING FOR BATTERY RELIABILITY AND PREDICTION , 1940 .

[66]  E. Strong,et al.  Proposed scoring changes for the Strong Vocational Interest Blank. , 1964 .

[67]  Clyde H. Coombs,et al.  The Assessment of Partial Knowledge , 1956 .

[68]  C. Jurgensen Item weights in employee rating scales. , 1955 .

[69]  S. M. Corey The effect of weighting exercises in a new type of examination. , 1930 .

[70]  R. Wherry,et al.  A New Formula for Predicting the Shrinkage of the Coefficient of Multiple Correlation , 1931 .

[71]  M. R. Novick,et al.  Statistical Theories of Mental Test Scores. , 1971 .

[72]  C. H. Lawshe,et al.  The Relative Efficiency of four Test Weighting Methods in Multiple Prediction , 1959 .

[73]  R. L. Winkler The Assessment of Prior Distributions in Bayesian Analysis , 1967 .

[74]  J. W. Dunlap,et al.  A simplified method for scoring the Strong Vocational Interest Blank. , 1941 .

[75]  P. Walmsley,et al.  Statistical Method , 1923, Nature.

[76]  C. H. Lawshe,et al.  The Method of Reciprocal Averages in Weighting Personnel Data , 1958 .

[77]  Julian C. Stanley,et al.  Weighting Test Items and Test-Item Options, an Overview of the Analytical and Empirical Literature , 1970 .

[78]  J. Terwilliger,et al.  AN EMPIRICAL STUDY OF THE EFFECTS OF STANDARDIZING SCORES IN THE FORMATION OF LINEAR COMPOSITES , 1969 .

[79]  E. H. Shuford,et al.  Admissible probability measurement procedures , 1966, Psychometrika.

[80]  R. Ebel CONFIDENCE WEIGHTING AND TEST RELIABILITY , 1965 .

[81]  LeVerne S. Collet,et al.  The Effects of Differing Instructions and Guessing Formulas on Reliability and Validity , 1968 .

[82]  R. Likert A Technique for the Measurement of Attitudes , 1932 .

[83]  L. Aiken Scoring for Partial Knowledge on the Generalized Rearrangement Item , 1970 .

[84]  F. Samejima Estimation of latent ability using a response pattern of graded scores , 1968 .

[85]  H. Gulliksen Theory of mental tests , 1952 .

[86]  A. Wesman,et al.  Multiple Regression vs. Simple Addition of Scores in Prediction of College Grades , 1959 .