Practical Issues in Equating
[1] M. J. Kolen. Population Invariance in Equating and Linking: Concept and History , 2004 .
[2] Harold F. O'Neil,et al. Effects of Motivational Interventions on the National Assessment of Educational Progress Mathematics Performance , 1995 .
[3] D. Whitney,et al. Comparison of Four Procedures for Equating the Tests of General Educational Development. , 1982 .
[4] R. Tate. Equating for Long-Term Scale Maintenance of Mixed Format Tests Containing Multiple Choice and Constructed Response Items , 2003 .
[5] Gautam Puhan. Detecting and Correcting Scale Drift in Test Equating: An Illustration from a Large Scale Testing Program , 2008 .
[6] Martha L. Stocking,et al. Practical Issues in Large-Scale Computerized Adaptive Testing , 1996 .
[7] Avi Allalouf,et al. Quality Control Procedures in the Scoring, Equating, and Reporting of Test Scores , 2007 .
[8] Samuel A. Livingston. Small‐Sample Equating With Log‐Linear Smoothing , 1993 .
[9] Mark D. Reckase,et al. Effect of the Medium of Item Presentation on Examinee Performance and Item Characteristics , 1989 .
[10] Samuel A. Livingston. ADJUSTING SCORES ON EXAMINATIONS OFFERING A CHOICE OF ESSAY QUESTIONS , 1988 .
[11] An NCME Instructional Module on Population Invariance in Linking and Equating , 2012 .
[12] Sooyeon Kim,et al. Linking Mixed‐Format Tests Using Multiple‐Choice Anchors , 2010 .
[13] P. Holland,et al. The Missing Data Assumptions of the NEAT Design and their Implications for Test Equating , 2010 .
[14] Christine E. DeMars,et al. Investigating the Impact of Compromised Anchor Items on IRT Equating Under the Nonequivalent Anchor Test Design , 2012 .
[15] N. Longford. Reliability of Essay Rating and Score Adjustment , 1994 .
[16] Ronald K. Hambleton,et al. Customized Tests and Customized Norms. , 1991 .
[17] Samuel A. Livingston,et al. What Combination of Sampling and Equating Methods Works Best , 1989 .
[18] Nancy L. Allen,et al. A MISSING DATA APPROACH TO ESTIMATING DISTRIBUTIONS OF SCORES FOR OPTIONAL TEST SECTIONS , 1994 .
[19] W. Angoff. Technical and Practical Issues in Equating: A Discussion of Four Papers , 1987 .
[20] K. Ricker,et al. SINGLE- VERSUS DOUBLE-SCORING OF TREND RESPONSES IN TREND SCORE EQUATING WITH CONSTRUCTED-RESPONSE TESTS , 2010 .
[21] N. Petersen,et al. A Test of the Adequacy of Curvilinear Score Equating Models , 1983 .
[22] P. Holland,et al. THE CORRELATION BETWEEN THE SCORES OF A TEST AND AN ANCHOR TEST , 2006 .
[23] B. Bridgeman,et al. THE EFFECT OF COMPUTER-BASED TESTS ON RACIAL/ETHNIC, GENDER, AND LANGUAGE GROUPS , 2000 .
[24] M. Lunz,et al. Equating Computerized Adaptive Certification Examinations: The Board of Registry Series of Studies. , 1995 .
[25] H. Huynh,et al. Contextual Characteristics of Locally Dependent Open-Ended Item Clusters in a Large-Scale Performance Assessment , 1997 .
[26] G. Engelhard,et al. The Effects of Task Choice on the Quality of Writing Obtained in a Statewide Assessment , 1995 .
[27] Cynthia G. Parshall,et al. Equating Error and Statistical Bias in Small Sample Linear Equating , 1995 .
[28] Betty A. Bergstrom,et al. An Empirical Study of Computerized Adaptive Test Administration Conditions. , 1994 .
[29] Mary Pommerich,et al. Developing Computerized Versions of Paper-and-Pencil Tests: Mode Effects for Passage-Based Tests , 2004 .
[30] Stephen G. Sireci,et al. The Impact of Multidirectional Item Parameter Drift on IRT Scaling Coefficients and Proficiency Estimates , 2012 .
[31] Rick Morgan,et al. EXPERIMENTAL STUDY OF THE EFFECTS OF CALCULATOR USE ON THE ADVANCED PLACEMENT CALCULUS EXAMINATIONS , 1991 .
[32] George Engelhard,et al. Evaluating Rater Accuracy in Performance Assessments. , 1996 .
[33] Mary Pommerich,et al. The Effect of Using Item Parameters Calibrated from Paper Administrations in Computer Adaptive Test Administrations , 2007 .
[34] Hong Jiao,et al. Comparability of Computer-Based and Paper-and-Pencil Testing in K–12 Reading Assessments , 2008 .
[35] Deborah J. Harris,et al. A Study of Criteria Used in Equating , 1993 .
[36] On bias in linear observed-score equating , 2010 .
[37] Tianyou Wang,et al. Evaluating Comparability in Computerized Adaptive Testing: Issues, Criteria and an Example , 2001 .
[38] D. Eignor. AN INVESTIGATION OF THE FEASIBILITY AND PRACTICAL OUTCOMES OF PRE‐EQUATING THE SAT VERBAL AND MATHEMATICAL SECTIONS , 1985 .
[39] Xiang-bo Wang. On the Viability of Some Untestable Assumptions in Equating Exams That Allow Examinee Choice. Program Statistics Research Technical Report No. 93-31. , 1993 .
[40] D. Budescu. Selecting an Equating Method: Linear or Equipercentile? , 1987 .
[41] B. Bridgeman,et al. Choice Among Essay Topics: Impact on Performance and Validity , 1997 .
[42] Bradley A. Hanson. A Comparison of Presmoothing and Postsmoothing Methods in Equipercentile Equating. ACT Research Report Series 94-4. , 1994 .
[43] Henry Braun,et al. Understanding Scoring Reliability: Experiments in Calibrating Essay Readers , 1988 .
[44] Brian D. Bontempo,et al. Repeater Patterns on NCLEX™ using CAT versus NCLEX™ using Paper-and-Pencil Testing , 1996 .
[45] N. Dorans,et al. Equating Test Scores: Toward Best Practices , 2009 .
[46] Multiple Linking in Equating and Random Scale Drift , 2011 .
[47] R. Jaeger. SOME EXPLORATORY INDICES FOR SELECTION OF A TEST EQUATING METHOD , 1981 .
[48] P. Holland,et al. Linking and aligning scores and scales , 2007 .
[49] H. Huynh,et al. Computer-Based and Paper-and-Pencil Administration Mode Effects on a Statewide End-of-Course English Test , 2008 .
[50] Harold F. O'Neil,et al. Policy and validity prospects for performance-based assessment. , 1993 .
[51] Susan R. Goldman,et al. Evaluation of Procedure-Based Scoring for Hands-On Science Assessment , 1992 .
[52] Insu Paek,et al. An Alternative to the Trend Scoring Method for Adjusting Scoring Shifts in Mixed-Format Tests , 2009 .
[53] B. Clauser,et al. The Impact of Statistically Adjusting for Rater Effects on Conditional Standard Errors of Performance Ratings , 2011 .
[54] George Engelhard,et al. Examining Rater Errors in the Assessment of Written Composition With a Many-Faceted Rasch Model , 1994 .
[55] A Discussion of Population Invariance of Equating , 2008 .
[56] Walter D. Way. Protecting the Integrity of Computerized Testing Item Pools , 1998 .
[57] P. Holland,et al. Observed Score Equating Using a Mini-Version Anchor and an Anchor with Less Spread of Difficulty: A Comparison Study , 2011 .
[58] Akihito Kamata,et al. The Performance of a Method for the Long‐term Equating of Mixed‐Format Assessment , 2005 .
[59] Linda L. Cook,et al. Simulation Results of Effects on Linear and Curvilinear Observed-and True-Score Equating Procedures of Matching on a Fallible Criterion , 1990 .
[60] Deborah J. Harris,et al. Comparison of Item Preequating and Random Groups Equating Using IRT and Equipercentile Methods , 1990 .
[61] Gerald C. Davison,et al. American Psychological Association (APA) , 2015 .
[62] Gautam Puhan. Impact of Inclusion or Exclusion of Repeaters on Test Equating , 2011 .
[63] Anthony R. Zara,et al. A Comparison of Procedures for Content-Sensitive Item Selection in Computerized Adaptive Tests. , 1991 .
[64] David M. Williamson,et al. A Framework for Evaluation and Use of Automated Scoring , 2012 .
[65] D. Jarjoura,et al. THE IMPORTANCE OF CONTENT REPRESENTATION FOR COMMON‐ITEM EQUATING WITH NONRANDOM GROUPS , 1985 .
[66] Michael E. Walker,et al. Score Linking Issues Related to Test Content Changes , 2007 .
[67] Sooyeon Kim,et al. Examining Two Strategies to Link Mixed-Format Tests Using Multiple-Choice Anchors. Research Report. ETS RR-10-18. , 2010 .
[68] Paul W. Holland,et al. Statistical models for test equating, scaling, and linking , 2011 .
[69] M. J. Kolen. Threats to Score Comparability with Applications to Performance Assessments and Computerized Adaptive Tests , 1999 .
[70] R. Brennan,et al. A Reply to Angoff , 1987 .
[71] D. Budescu. EFFICIENCY OF LINEAR EQUATING AS A FUNCTION OF THE LENGTH OF THE ANCHOR TEST , 1985 .
[72] Nancy S. Petersen. Equating: Best Practices and Challenges to Best Practices , 2007 .
[73] R. Brennan. A Discussion of Population Invariance , 2008 .
[74] G. Neuman,et al. Computerization of Paper-and-Pencil Tests: When are They Equivalent? , 1998 .
[75] Samuel A. Livingston,et al. An Evaluation of the Kernel Equating Method: A Special Study with Pseudotests Constructed from Real Test Data. Research Report. ETS RR-06-02. , 2006 .
[76] Cornelis A.W. Glas,et al. Computerized adaptive testing : theory and practice , 2000 .
[77] S. Haberman,et al. Small-Sample Equating Using a Synthetic Linking Function. , 2008 .
[78] Mark D. Reckase,et al. TECHNICAL GUIDELINES FOR ASSESSING COMPUTERIZED ADAPTIVE TESTS , 1984 .
[79] The Effectiveness of Circular Equating as a Criterion for Evaluating Equating , 2000 .
[80] Samuel A. Livingston,et al. Comparisons among Small Sample Equating Methods in a Common‐Item Design , 2010 .
[81] Neal M. Kingston. Comparability of Computer- and Paper-Administered Multiple-Choice Tests for K–12 Populations: A Synthesis , 2008 .
[82] Linda L. Cook,et al. Sensitivity of Equating Results to Different Sampling Strategies. , 1990 .
[83] Assessing Equating Results on Different Equating Criteria , 2005 .
[84] Martha L. Stocking. Revising Item Responses in Computerized Adaptive Tests: A Comparison of Three Models , 1997 .
[85] Willem J. van der Linden,et al. Local Observed-Score Equating , 2009 .
[86] Neal M. Kingston,et al. Item Location Effects and Their Implications for IRT Equating and Adaptive Testing , 1984 .
[87] W. D. Linden,et al. Local linear observed-score equating , 2011 .
[88] Walter P. Vispoel,et al. Reviewing and Changing Answers on Computer‐adaptive and Self‐adaptive Vocabulary Tests , 1998 .
[89] Dorothy T. Thayer,et al. The Chain and Post‐Stratification Methods for Observed‐Score Equating: Their Relationship to Population Invariance , 2004 .
[90] Leonard S. Cahen,et al. Educational Testing Service , 1970 .
[91] Warren W. Willingham,et al. Testing handicapped people , 1988 .
[92] Tianyou Wang,et al. Computerized Adaptive and Fixed‐Item Testing of Music Listening Skill: A Comparison of Efficiency, Precision, and Concurrent Validity , 1997 .
[93] Richard L. Tate. A Cautionary Note on IRT-Based Linking of Tests With Polytomous Items , 1999 .
[94] George Leckie,et al. Rater Effects on Essay Scoring: A Multilevel Analysis of Severity Drift, Central Tendency, and Rater Experience , 2011 .
[95] N. Dorans. Recentering and Realigning the SAT Score Distributions: How and Why. , 2002 .
[96] N. Dorans. Using Subpopulation Invariance to Assess Test Score Equity , 2004 .
[97] Neil J. Dorans,et al. THE EFFECTS OF ITEM REARRANGEMENT ON TEST PERFORMANCE: A REVIEW OF THE LITERATURE , 1982 .
[98] Frederic M. Lord,et al. Comparison of IRT True-Score and Equipercentile Observed-Score "Equatings" , 1984 .
[99] Fritz Drasgow,et al. Innovations in Computerized Assessment , 1999 .
[100] USING REPEATERS FOR ESTIMATING COMPARABLE SCORES , 1999 .
[101] Brian Rothschild,et al. Effects of Extended Time on the SAT I: Reasoning Test Score Growth for Students with Learning Disabilities , 1998 .
[102] G. C. Bussolino,et al. Long-term performance of a transfer standard pyrometer , 1990 .
[103] Marie Wiberg,et al. Observed Score Linear Equating with Covariates , 2011 .
[104] Howard Wainer,et al. SOME PRACTICAL CONSIDERATIONS WHEN CONVERTING A LINEARLY ADMINISTERED TEST TO AN ADAPTIVE FORMAT , 1992 .
[105] John Mazzeo. Comparability of Computer and Paper-and-Pencil Scores for Two CLEP General Examinations. College Board Report No. 91-5. , 1991 .
[106] Milton H Maier,et al. Military Aptitude Testing: The Past Fifty Years , 1993 .
[107] Gary L. Thomasson. The Goal of Equity within and between Computerized Adaptive Tests and Paper and Pencil Forms. , 1997 .
[108] F. Vijver,et al. The incomplete equivalence of the paper-and-pencil and computerized versions of the General Aptitude Test Battery , 1994 .
[109] Samuel A. Livingston,et al. Collateral Information for Equating in Small Samples: A Preliminary Investigation , 2011 .
[110] Robert W. Lissitz,et al. IRT Test Equating: Relevant Issues and a Review of Recent Research , 1986 .
[111] S. Sireci,et al. Evaluating the Comparability of Paper- and Computer-Based Science Tests across Sex and SES Subgroups. , 2012 .
[112] T. Hsu,et al. Exploring the Feasibility of Collateral Information Test Equating , 2002 .
[113] Anne L. Harvey,et al. The Equivalence of Scores from Automated and Conventional Educational and Psychological Tests: A Review of the Literature. College Board Report No. 88-8. , 1988 .
[114] Ronald K. Hambleton,et al. Consequences of Violated Equating Assumptions Under the Equivalent Groups Design , 2011 .
[115] M. J. Kolen,et al. The Effect of Repeaters on Equating , 2010 .
[116] A. A. Davier. Potential Solutions to Practical Equating Issues , 2007 .
[117] Wim J. van der Linden,et al. Local Observed-Score Equating With Anchor-Test Designs , 2010 .
[118] R. Hambleton,et al. Fundamentals of Item Response Theory , 1991 .
[119] David M. Williamson,et al. EVALUATION OF THE E‐RATER® SCORING ENGINE FOR THE GRE® ISSUE AND ARGUMENT PROMPTS , 2012 .
[120] Fritz Drasgow. The work ahead: A psychometric infrastructure for computerized adaptive tests , 2005 .
[121] Stability of Rasch Scales Over Time , 2009 .
[122] Chockalingam Viswesvaran,et al. Least Squares Models to Correct for Rater Effects in Performance Assessment , 1993 .
[123] R. Brennan,et al. Some Practical Issues in Equating , 1987 .
[124] Sooyeon Kim,et al. EVALUATING SUBPOPULATION INVARIANCE OF LINKING FUNCTIONS TO DETERMINE THE ANCHOR COMPOSITION FOR A MIXED‐FORMAT TEST , 2009 .
[125] Neil J. Dorans,et al. Sources of Score Scale Inconsistency , 2011 .
[126] Jinghua Liu,et al. A Scale Drift Study , 2009 .
[127] Vonda L. Kiplinger,et al. Raising the Stakes of Test Administration: The Impact on Student Performance on the National Assessment of Educational Progress. , 1995 .
[128] Neil J. Dorans,et al. Item Response Theory, Item Calibration, and Proficiency Estimation , 2000 .
[129] Michael J. Kolen,et al. Evaluation of Two New Smoothing Methods in Equating: The Cubic B-Spline Presmoothing Method and the Direct Presmoothing Method. , 2009 .
[130] Eric T. Bradlow,et al. Item Response Theory Models Applied to Data Allowing Examinee Choice , 1998 .
[131] Brent Bridgeman,et al. Comparison of Human and Machine Scoring of Essays: Differences by Gender, Ethnicity, and Country , 2012 .
[132] Michalis P. Michaelides. Sensitivity of Equated Aggregate Scores to the Treatment of Misbehaving Common Items , 2010 .
[133] Evaluating Equating Accuracy and Assumptions for Groups that Differ in Performance. , 2014 .
[134] Craig N. Mills,et al. FIELD TEST OF A COMPUTER-BASED GRE GENERAL TEST , 1993 .
[135] H. Huynh,et al. Equivalence of Paper-and-Pencil and Online Administration Modes of the Statewide English Test for Students With and Without Disabilities , 2010 .
[136] Andrew J. Poggio,et al. A Comparative Evaluation of Score Results from Computerized and Paper & Pencil Mathematics Testing in a Large Scale State Assessment Program , 2005 .
[137] Christine E. DeMars. Detection of Item Parameter Drift over Multiple Test Administrations , 2004 .
[138] Daniel R. Eignor,et al. DERIVING COMPARABLE SCORES FOR COMPUTER ADAPTIVE AND CONVENTIONAL TESTS: AN EXAMPLE USING THE SAT , 1993 .
[139] H. Leeson. The Mode Effect: A Literature Review of Human and Technological Issues in Computerized Testing , 2006 .
[140] Neil J. Dorans,et al. CONSISTENCY OF SAT® I: REASONING TEST SCORE CONVERSIONS , 2008 .
[141] Deborah J. Harris,et al. Psychometric Properties of Scale Scores and Performance Levels for Performance Assessments Using Polytomous IRT , 2000 .
[142] M. J. Kolen. Does Matching in Equating Work? A Discussion. , 1990 .
[143] The Optimal Degree of Smoothing in Equipercentile Equating with Postsmoothing. , 1995 .
[144] Linda L. Cook,et al. IRT Versus Conventional Equating Methods: A Comparative Study of Scale Stability , 1983 .
[145] Samuel A. Livingston,et al. New Approaches to Equating With Small Samples , 2009 .
[146] Hongwen Guo,et al. Accumulative Equating Error after a Chain of Linear Equatings , 2010 .
[147] Timothy D. Ritchie,et al. Factors in Paper-and-Pencil and Computer Reading Score Differences at the Primary Grades , 2006 .
[148] Mary E. Lunz,et al. The Effect of Review on the Psychometric Characteristics of Computerized Adaptive Tests. , 1994 .
[149] Equating error in observed-score equating , 2006 .
[150] Linda L. Cook,et al. SPECIFYING THE CHARACTERISTICS OF LINKING ITEMS USED FOR ITEM RESPONSE THEORY ITEM CALIBRATION , 1987 .
[151] Howard Wainer,et al. How Well Can We Compare Scores on Test Forms That Are Constructed by Examinees Choice , 1994 .
[152] Robert L. Linn,et al. High-Stakes Uses of Performance-Based Assessments , 1995 .
[153] P. Holland,et al. A New Approach to Comparing Several Equating Methods in the Context of the NEAT Design , 2010 .
[154] C. Glas,et al. Elements of adaptive testing , 2010 .
[155] ALTERNATIVE LOGLINEAR SMOOTHING MODELS AND THEIR EFFECT ON EQUATING FUNCTION ACCURACY , 2009 .
[156] Deborah J. Harris,et al. Effect of Examinee Group on Equating Relationships , 1986 .
[157] John W. Young,et al. The Cognitive Equivalence of Reading Comprehension Test Items Via Computerized and Paper-and-Pencil Administration , 2003 .
[158] Walter M. Houston,et al. Adjustments for Rater Effects in Performance Assessment , 1991 .
[159] Sooyeon Kim,et al. Investigating the Effectiveness of Equating Designs for Constructed‐Response Tests in Large‐Scale Assessments , 2010 .
[160] Randy Elliot Bennett,et al. Does it Matter if I take My Writing Test on Computer? An Empirical Study of Mode Effects in NAEP , 2006 .
[161] Neil J. Dorans,et al. Implications for Altering the Context in Which Test Items Appear: A Historical Perspective on an Immediate Concern , 1985 .
[162] Kadriye Ercikan,et al. Calibration and Scoring of Tests With Multiple-Choice and Constructed-Response Item Types , 1998 .
[163] Gary W. Phillips,et al. Technical Issues in Large-Scale Performance Assessment. , 1996 .
[164] E. Baker,et al. Impact of Accommodation Strategies on English Language Learners' Test Performance , 2005 .
[165] Cynthia G. Parshall,et al. Practical Considerations in Computer-Based Testing , 2002 .
[166] M. Pomplun. A Bifactor Analysis for a Mode-of-Administration Effect , 2007 .
[167] Brent Bridgeman,et al. Effects of Screen Size, Screen Resolution, and Display Rate on Computer-Based Test Performance , 2001 .
[168] Gregory J. Cizek,et al. The Effect of Altering the Position of Options in a Multiple-Choice Examination , 1994 .
[169] Gary A. Schaeffer. The Introduction and Comparability of the Computer Adaptive GRE General Test. GRE Board Professional Report No. 88-08aP. , 1995 .
[170] R. Brennan. Tests in Transition: Discussion and Synthesis , 2007 .
[171] Samuel A. Livingston,et al. Random‐Groups Equating with Samples of 50 to 400 Test Takers , 2010 .
[172] Y. Attali. Sequential Effects in Essay Ratings , 2011 .
[174] Wendy M. Yen,et al. The Psychometric Characteristics of Choice Items , 1995 .
[175] P. Holland,et al. The Effects of Selection Strategies for Bivariate Loglinear Smoothing Models on NEAT Equating Functions. , 2010 .
[176] Effect on Equating Results of Matching Samples on an Anchor Test. , 1990 .
[177] Luuk C. Rietveld,et al. Practical Aspects of Task Allocation in Design and Development of Digital Closed Questions in Higher Education , 2008 .
[178] James M. Royer,et al. Testing Accommodations for Examinees With Disabilities: A Review of Psychometric, Legal, and Social Policy Issues , 2001 .
[179] D. Eignor. Linking Scores Derived Under Different Modes of Test Administration , 2007 .
[180] Kevin C. Larkin,et al. SUBPOPULATION INVARIANCE OF EQUATING FUNCTIONS , 2006 .
[181] G. E. Miller,et al. Expected Equating Error Resulting From Incorrect Handling of Item Parameter Drift Among the Common Items , 2009 .
[182] Gerald E. DeMauro,et al. AN INVESTIGATION OF THE APPROPRIATENESS OF THE TOEFL TEST AS A MATCHING VARIABLE TO EQUATE TWE TOPICS , 1992 .
[183] R. Hambleton,et al. Evaluating Score Equity Assessment for State NAEP , 2009 .
[184] P. Congdon,et al. The Stability of Rater Severity in Large‐Scale Assessment Programs , 2000 .
[185] The Effects of Test Length and Sample Size on the Reliability and Equating of Tests Composed of Constructed-Response Items , 2001 .
[186] P. Cheng,et al. Estimating Comparable Scores Using Surrogate Variables , 2001 .
[187] Brent Bridgeman,et al. COMPARABILITY OF PAPER-AND-PENCIL AND COMPUTER ADAPTIVE TEST SCORES ON THE GRE® GENERAL TEST , 1998 .
[188] H. Wainer,et al. On Examinee Choice in Educational Testing , 1994 .
[189] W. R. Cowell,et al. AN EXAMINATION OF THE ASSUMPTION THAT THE EQUATING OF PARALLEL FORMS IS POPULATION‐INDEPENDENT , 1985 .
[190] H. Huynh,et al. A Comparison of Equal Percentile and Partial Credit Equatings for Performance-Based Assessments Composed of Free-Response Items. , 1994 .
[191] Invariance of Score Linkings Across Gender Groups for Forms of a Testlet-Based College-Level Examination Program Examination , 2008 .
[192] Jill Burstein,et al. Automated Essay Scoring : A Cross-disciplinary Perspective , 2003 .
[193] Michalis P. Michaelides,et al. An Illustration of a Mantel-Haenszel Procedure to Flag Misbehaving Common Items in Test Equating , 2008 .
[194] R. C. Sykes,et al. The Effects of Computer Administration on Scores and Item Parameter Estimates of an IRT-Based Licensure Examination , 1997 .
[195] Alina A. von Davier,et al. Practical Application of a Synthetic Linking Function on Small-Sample Equating , 2011 .
[196] THE EFFECTS ON OBSERVED- AND TRUE-SCORE EQUATING PROCEDURES OF MATCHING ON A FALLIBLE CRITERION: A SIMULATION WITH TEST VARIATION , 1990 .
[197] Robert C. Sykes,et al. The Scaling of Mixed-Item Format Tests with the One-Parameter and Two-Parameter Partial Credit Models. , 2000 .
[198] Linda L. Cook,et al. Problems Related to the Use of Conventional and Item Response Theory Equating Methods in Less Than Optimal Circumstances , 1987 .
[199] Robert L. Ziomek,et al. Predicting the College Grade Point Averages of Special-Tested Students from Their ACT Assessment Scores and High School Grades. , 1996 .
[200] A Graphical Approach to Evaluating Equating Using Test Characteristic Curves , 2011 .
[201] Tim Davey,et al. Computer-Adaptive Testing for Students with Disabilities: A Review of the Literature. Research Report. ETS RR-11-32. , 2011 .
[202] Two Approaches for Using Multiple Anchors in NEAT Equating , 2011 .
[203] Samuel A. Livingston,et al. The Circle-Arc Method for Equating in Small Samples , 2009 .
[204] Bruce Bloxom,et al. Operational Calibration of the Circular-Response Optical-Mark-Reader Answer Sheets for the Armed Services Vocational Aptitude Battery (ASVAB) , 1993 .
[205] Deniz S. Ones,et al. Psychometric equivalence of the computer and booklet forms of the MMPI: A meta-analysis , 1999 .
[206] Quantifying Equating Errors with Item Response Theory Methods , 1985 .
[207] H. Wainer,et al. Are Tests Comprising Both Multiple‐Choice and Free‐Response Items Necessarily Less Unidimensional Than Multiple‐Choice Tests? An Analysis of Two Tests , 1994 .
[208] Lixiong Gu,et al. Differential Item Functioning of GRE Mathematics Items across Computerized and Paper-and-Pencil Testing Media. , 2006 .
[209] Sooyeon Kim,et al. Evaluating the Comparability of Paper-and-Pencil and Computerized Versions of a Large-Scale Certification Test. Research Report. ETS RR-05-21. , 2005 .
[210] Manfred Steffen,et al. The GRE Computer Adaptive Test: Operational Issues , 2000 .
[211] Peter E. Kennedy,et al. Combining Multiple-Choice and Constructed-Response Test Scores: An Economist's View , 1997 .
[212] B. Loyd. Mathematics Test Performance: The Effects of Item Type and Calculator Use , 1991 .
[213] L. Crocker,et al. Achieving Form-to-Form Comparability: Fundamental Issues and Proposed Strategies for Equating Performance Assessments of Teachers , 1995 .
[214] D. Borsboom. Educational Measurement (4th ed.) , 2009 .
[215] K. Ercikan,et al. The Consistency Between Raters Scoring in Different Test Years , 1998 .
[216] Walter D. Way. IRT Ability Estimates from Customized Achievement Tests Without Representative Content Sampling , 1989 .
[217] Martha L. Stocking,et al. A Method for Severely Constrained Item Selection in Adaptive Testing , 1992 .
[218] Masune Sukigara,et al. Equivalence between Computer and Booklet Administrations of the New Japanese Version of the MMPI , 1996 .
[219] Robert L. Brennan,et al. Conditional standard errors of measurement for scale scores using binomial and compound binomial assu , 1992 .
[220] P. Holland,et al. How to Average Equating Functions, If You Must , 2009 .
[221] Shudong Wang,et al. A Meta-Analysis of Testing Mode Effects in Grade K-12 Mathematics Tests , 2007 .
[222] Daniel O. Segall,et al. Equating the CAT-ASVAB. , 1997 .
[223] Willem J. van der Linden. Computerized adaptive testing with equated number-correct scoring , 2001 .
[224] R. Mckinley,et al. Reducing Test Form Overlap of the GRE Subject Test in Mathematics Using IRT Triple-Part Equating. GRE Board Professional Report No. 86-14P. , 1989 .
[225] Dorothy T. Thayer,et al. POPULATION INVARIANCE OF SCORE LINKING: THEORY AND APPLICATIONS TO ADVANCED PLACEMENT PROGRAM® EXAMINATIONS , 2003 .
[226] N. Dorans,et al. USING THE SELECTION VARIABLE FOR MATCHING OR EQUATING , 1993 .
[227] Linda L. Cook. Practical Problems in Equating Test Scores: A Practitioner’s Perspective , 2007 .
[228] T. Davey,et al. Potential Impact of Context Effects on the Scoring and Equating of the Multistage GRE® Revised General Test , 2011 .
[229] Exploring Population Sensitivity of Linking Functions Across Three Law School Admission Test Administrations , 2008 .
[230] James W Pellegrino,et al. Technology and Testing , 2009, Science.
[231] Mary E. Lunz,et al. Interjudge Reliability and Decision Reproducibility , 1994 .
[232] Katie Larsen McClarty,et al. Item-Level Comparative Analysis of Online and Paper Administrations of the Texas Assessment of Knowledge and Skills , 2008 .
[233] Wim J. van der Linden,et al. Capitalization on Item Calibration Error in Adaptive Testing , 1998 .
[234] N. Dorans,et al. Checking the Statistical Equivalence of Nearly Identical Test Editions , 1990 .
[235] S. Sinharay. Chain Equipercentile Equating and Frequency Estimation Equipercentile Equating: Comparisons Based on Real and Simulated Data , 2011 .
[236] W. D. Linden. Equating Scores from Adaptive to Linear Tests , 2006 .
[237] Shelby J. Haberman,et al. Limits on the Accuracy of Linking. Research Report. ETS RR-10-22. , 2010 .
[238] First Language of Test Takers and Fairness Assessment Procedures , 2011 .
[239] J. S. Gilmer. The Effects of Test Disclosure on Equated Scores and Pass Rates , 1989 .
[240] Wendy M. Yen,et al. Scaling Performance Assessments: Strategies for Managing Local Item Dependence , 1993 .
[241] The Impact of Item Deletion on Equating Conversions and Reported Score Distributions. , 1986 .
[242] F. Drasgow,et al. Equivalence of computerized and paper-and-pencil cognitive ability tests: A meta-analysis. , 1993 .
[243] D. D. Bickerstaff,et al. Computerized adaptive testing , 2015 .
[244] P. Holland,et al. Population Invariance and the Equatability of Tests: Basic Theory and The Linear Case , 2000 .
[245] Invariance of Equating Functions Across Different Subgroups of Examinees Taking a Science Achievement Test , 2008 .
[246] F. Drasgow,et al. Does computerizing paper-and-pencil job attitude scales make a difference? New IRT analyses offer insight. , 2000, The Journal of applied psychology.
[247] H. Wainer,et al. COMBINING MULTIPLE-CHOICE AND CONSTRUCTED RESPONSE TEST SCORES: TOWARD A MARXIST THEORY OF TEST CONSTRUCTION , 1992 .
[248] Gautam Puhan. A Comparison of Chained Linear and Poststratification Linear Equating under Different Testing Conditions. , 2010 .
[249] T. F. Donlon. The College Board technical handbook for the scholastic aptitude test and achievement tests , 1984 .
[250] Shelby J. Haberman,et al. Limits on the Accuracy of Linking , 2010 .
[251] Catherine M. Hombo,et al. Equating and Linking of Performance Assessments , 2000 .
[252] Jaeyool Boo,et al. Computerized and Paper-and-Pencil Versions of the Rosenberg Self-Esteem Scale: A Comparison of Psychometric Features and Respondent Preferences , 2001 .
[253] Wendy M. Yen,et al. The Maryland School Performance Assessment Program: Performance Assessment with Psychometric Quality Suitable for High Stakes Usage , 1997 .
[254] Stephen B. Dunbar,et al. Quality Control in the Development and Use of Performance Assessments , 1991 .
[255] Effects of Passage and Item Scrambling on Equating Relationships , 1991 .
[256] Selection Strategies for Univariate Loglinear Smoothing Models and Their Effect on Equating Function Accuracy , 2009 .
[257] Rebecca D. Hetter,et al. Evaluating item calibration medium in computerized adaptive testing. , 1997 .
[258] P. Holland,et al. An Approach to Evaluating the Missing Data Assumptions of the Chain and Post-stratification Equating Methods for the NEAT Design , 2008 .
[259] Hyeonjoo J. Oh,et al. The Effects of Essay Placement and Prompt Type on Performance on the New SAT , 2006 .
[260] Mark Wilson,et al. Complex Composites: Issues That Arise in Combining Different Modes of Assessment , 1995 .
[261] I. Lawrence,et al. LINKING SCORES FOR COMPUTER-ADAPTIVE AND PAPER-AND-PENCIL ADMINISTRATIONS OF THE SAT , 1997 .
[262] Does Linking Mixed-Format Tests Using a Multiple-Choice Anchor Produce Comparable Results for Male and Female Subgroups? , 2011 .
[263] INVARIANCE OF LINKINGS OF THE REVISED 2005 SAT REASONING TEST™ TO THE SAT® I: REASONING TEST ACROSS GENDER GROUPS , 2005 .
[264] R. Brennan,et al. A Comparison of the Frequency Estimation and Chained Equipercentile Methods Under the Common-Item Nonequivalent Groups Design , 2008 .
[265] HOW UNIDIMENSIONAL ARE TESTS COMPRISING BOTH MULTIPLE-CHOICE AND FREE-RESPONSE ITEMS? AN ANALYSIS OF TWO TESTS , 1993 .
[266] Walter P. Vispoel,et al. Individual Differences and Test Administration Procedures: A Comparison of Fixed-Item, Computerized-Adaptive, and Self-Adapted Testing. , 1994 .
[267] Anchor Test Type and Population Invariance: An Exploration Across Subpopulations and Test Administrations , 2008 .
[268] R. Mislevy. Linking Educational Assessments: Concepts, Issues, Methods, and Prospects. , 1992 .
[269] Robustness to Format Effects of IRT Linking Methods for Mixed-Format Tests , 2006 .
[270] Rebecca Zwick. Effects of Item Order and Context on Estimation of NAEP Reading Proficiency , 1991 .
[271] Robert J. Mislevy,et al. How to Equate Tests With Little or No Data , 1993 .
[272] Samuel A. Livingston,et al. A Case of Inconsistent Equatings: How the Man With Four Watches Decides What Time It Is , 2009 .
[273] Kathleen E. Moreno,et al. The Effects of Mode of Test Administration on Test Performance , 1986 .
[274] Cynthia G. Parshall,et al. Computer Testing versus Paper-and-Pencil Testing: An Analysis of Examinee Characteristics Associated with Mode Effect. , 1993 .
[275] R. Hambleton,et al. International Perspectives on Academic Assessment , 2012 .
[276] Martha L. Stocking. THREE PRACTICAL ISSUES FOR MODERN ADAPTIVE TESTING ITEM POOLS , 1994 .
[277] Sooyeon Kim,et al. Comparisons among Designs for Equating Mixed‐Format Tests in Large‐Scale Assessments , 2010 .
[278] Robert L. Brennan. The Context of Context Effects , 1992 .
[279] P. Holland,et al. Is It Necessary to Make Anchor Tests Mini-Versions of the Tests Being Equated or Can Some Restrictions Be Relaxed? , 2007 .
[280] Tim Moses. AN EVALUATION OF STATISTICAL STRATEGIES FOR MAKING EQUATING FUNCTION SELECTIONS , 2008 .
[281] Willem J. van der Linden,et al. Linear Models for Optimal Test Design , 2005 .
[282] Test Score Equating Using a Mini‐Version Anchor and a Midi Anchor: A Case Study Using SAT® Data , 2011 .
[283] M. J. Kolen,et al. Conditional Standard Errors of Measurement for Scale Scores Using IRT , 1996 .
[284] Anthony R. Zara,et al. Procedures for Selecting Items for Computerized Adaptive Tests. , 1989 .
[285] George Engelhard,et al. The Measurement of Writing Ability With a Many-Faceted Rasch Model , 1992 .
[286] N. Dorans. Equating Methods and Sampling Designs , 1990 .
[287] William A. Sands,et al. Computerized adaptive testing: From inquiry to operation. , 1997 .
[288] Richard L. Tate,et al. Performance of a Proposed Method for the Linking of Mixed Format Tests With Constructed Response and Multiple Choice Items , 2000 .