Paper Information - A Comparison of Several Goodness-of-Fit Statistics

A Comparison of Several Goodness-of-Fit Statistics

A study was conducted to evaluate four goodness-of-fit procedures using data simulation techniques. The procedures were evaluated using data generated according to three different item response theory models and a factor analytic model. Three different distributions of ability were used, as were three different sample sizes. It was concluded that the likelihood ratio chi-square procedure yielded the fewest erroneous rejections of the hypothesis of fit, whereas Bock's chi-square procedure yielded the fewest erroneous acceptances of fit. It was found that sample sizes somewhere between 500 and 1,000 were best. Shifts in the mean of the ability distribution were found to cause minor fluctuations, but they did not appear to be a major issue.

Craig N. Mills | Robert L. McKinley

[1] R. Darrell Bock, et al. Estimating item parameters and latent ability when responses are scored in two or more nominal categories, 1972.

[2] E. B. Andersen, et al. A goodness of fit test for the Rasch model, 1973.

[3] R. J. Wherry, et al. Generating multiple samples of multivariate data with arbitrary population parameters, 1965, Psychometrika.

[4] B. Wright, et al. Best test design, 1979.

[5] R. D. Bock, et al. Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm, 1981.

[6] P. Holland. When are item response models consistent with observed data?, 1981.

[7] Ronald K. Hambleton, et al. Item Response Models, 1985.

[8] P. Rosenbaum. Testing the conditional independence and monotonicity assumptions of item response theory, 1984.

[9] Stephen E. Fienberg, et al. Discrete Multivariate Analysis: Theory and Practice, 1976.

[10] W. M. Yen. Using Simulation Results to Choose a Latent Trait Model, 1981.