Cross-Validation

The development of the kernel equating (KE) method enhanced the theory of observed-score equating. In KE, discrete test score distributions are converted into continuous distributions by means of a Gaussian kernel. Traditionally, the optimal bandwidth for the Gaussian kernel has been obtained by minimizing a penalty function. This article proposes an alternative bandwidth-selection approach for KE based on cross-validation (CV) techniques. The method is illustrated through simulations conducted under 188 conditions that vary three factors known to influence equating results: sample size, score distribution, and method (the combination of equating method and bandwidth-selection method). Four equating procedures were considered: traditional equipercentile equating, which uses linear interpolation to make the test score distributions continuous; KE with penalty functions; and KE with each of two newly proposed CV methods. The results were evaluated on four criteria: bias in continuizing the distributions (i.e., the difference between the estimated and underlying score distributions), the standard error of equating (SEE), the difference between equated scores, and percent relative error (PRE). Overall, the results demonstrate that KE with the two CV methods outperformed the other procedures: the estimated density functions were less biased, and the SEEs and PREs were smaller. The methods produced different equated scores, although the differences were not large. In addition, the article discusses how the bias of kernel methods depends on sample size and on the shape of the score distributions.
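The two bandwidth-selection ideas contrasted above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the article's implementation: the continuization uses the standard mean- and variance-preserving Gaussian-kernel form from the KE literature, the penalty is the usual squared difference between observed score probabilities and the continuized density at the score points, and the CV criterion is a generic leave-one-out log-likelihood chosen for illustration (the abstract does not specify the two proposed CV methods). The function names, the score frequencies, and the bandwidth grid are all hypothetical.

```python
import numpy as np

def ke_density(x, scores, probs, h):
    """Gaussian-kernel continuized density of a discrete score distribution.

    Uses the mean- and variance-preserving form common in the KE literature.
    """
    scores = np.asarray(scores, dtype=float)
    probs = np.asarray(probs, dtype=float)
    mu = np.sum(probs * scores)
    var = np.sum(probs * (scores - mu) ** 2)
    a = np.sqrt(var / (var + h ** 2))          # shrinkage factor
    x = np.atleast_1d(np.asarray(x, dtype=float))
    z = (x[:, None] - a * scores[None, :] - (1 - a) * mu) / (a * h)
    phi = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return (phi * probs[None, :]).sum(axis=1) / (a * h)

def pen1(h, scores, probs):
    """Penalty: squared gap between observed probabilities and the density."""
    probs = np.asarray(probs, dtype=float)
    return float(np.sum((probs - ke_density(scores, scores, probs, h)) ** 2))

def loo_log_likelihood(h, scores, counts):
    """Leave-one-out log-likelihood (illustrative CV criterion, an assumption).

    Each examinee is evaluated under the density estimated without them.
    """
    scores = np.asarray(scores, dtype=float)
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    total = 0.0
    for j, c in enumerate(counts):
        if c == 0:
            continue
        loo = counts.copy()
        loo[j] -= 1                             # drop one examinee at score j
        dens = ke_density(scores[j], scores, loo / (n - 1), h)[0]
        total += c * np.log(max(dens, 1e-300))  # floor avoids log(0)
    return total

# Hypothetical score frequencies on a 0-20 test (made-up data).
scores = np.arange(21)
counts = np.array([1, 2, 4, 7, 12, 18, 25, 30, 33, 32, 28,
                   23, 17, 12, 8, 5, 3, 2, 1, 1, 0])
probs = counts / counts.sum()

h_grid = np.linspace(0.3, 2.0, 18)              # hypothetical search grid
h_pen = h_grid[np.argmin([pen1(h, scores, probs) for h in h_grid])]
h_cv = h_grid[np.argmax([loo_log_likelihood(h, scores, counts) for h in h_grid])]
print("penalty-based h:", h_pen, "| leave-one-out CV h:", h_cv)
```

A grid search keeps the sketch transparent; in practice a one-dimensional numerical optimizer could replace it. The two criteria need not select the same bandwidth, which is the kind of difference the simulation study examines.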
