Constructing Interpretable and Practical Subdomain Score Vertical Scales

[1]  J. Mckillip,et al.  Fundamentals of item response theory , 1993 .

[2]  Shelby J. Haberman,et al.  Reporting of Subscores Using Multidimensional Item Response Theory , 2010 .

[3]  David Thissen,et al.  Using the Testlet Response Model as a Shortcut to Multidimensional Item Response Theory Subscore Computation , 2013 .

[4]  Sandip Sinharay,et al.  How Often Do Subscores Have Added Value? Results from Operational and Simulated Data , 2010 .

[5]  Shelby J. Haberman,et al.  When Can Subscores Have Value? , 2008 .

[6]  J. de la Torre,et al.  A Comparison of Four Methods of IRT Subscoring , 2011 .

[7]  R. Tsutakawa,et al.  The effect of uncertainty of item parameter estimation on ability estimates , 1990 .

[8]  Yong Luo,et al.  Using the Stan Program for Bayesian Item Response Theory , 2018, Educational and psychological measurement.

[9]  Lihua Yao,et al.  A Multidimensional Item Response Modeling Approach for Improving Subscale Proficiency Estimation and Classification , 2007 .

[10]  Jan de Leeuw,et al.  On the relationship between item response theory and factor analysis of discretized variables , 1987 .

[11]  R. D. Bock,et al.  Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm , 1981 .

[12]  David Thissen,et al.  On the relationship between the higher-order factor model and the hierarchical factor model , 1999 .

[13]  Hung-Yu Huang,et al.  Higher-Order Item Response Models for Hierarchical Latent Traits , 2013 .

[14]  Terry A. Ackerman The Use of Unidimensional Parameter Estimates of Multidimensional Items in Adaptive Testing , 1991 .

[15]  R. C. Sykes,et al.  Concurrent and Separate Grade-Groups Linking Procedures for Vertical Scaling , 2008 .

[16]  F. Lord,et al.  An Empirical Study of the Stability of a Group Mean in Relation to the Distribution of Test Items Among Students , 1958 .

[17]  Shelby J. Haberman Subscores and Validity , 2008 .

[18]  Tae-Je Seong Sensitivity of Marginal Maximum Likelihood Estimation of Item and Ability Parameters to the Characteristics of the Prior Ability Distributions , 1990 .

[19]  A Comparison of Approaches for Improving the Reliability of Objective Level Scores , 2010 .

[20]  George Engelhard,et al.  Full-Information Item Factor Analysis: Applications of EAP Scores , 1985 .

[21]  Ying Li,et al.  Exploring the Full-Information Bifactor Model in Vertical Scaling With Construct Shift , 2012 .

[22]  J. Edwards Genetic Epistemology , 1971 .

[23]  Tianyou Wang,et al.  Conditional Standard Errors of Measurement for Composite Scores Using IRT , 2012 .

[24]  Jimmy de la Torre,et al.  Simultaneous Estimation of Overall and Domain Abilities: A Higher-Order IRT Model Approach , 2009 .

[25]  David A. Harrison,et al.  Robustness of IRT Parameter Estimation to Violations of the Unidimensionality Assumption , 1986 .

[26]  Jungnam Kim A comparison of calibration methods and proficiency estimators for creating IRT vertical scales , 2007 .

[27]  Frederic M. Lord,et al.  Estimating Norms by Item Sampling , 1961 .

[28]  Shelby J. Haberman,et al.  Do Adjusted Subscores Lack Validity? Don’t Blame the Messenger , 2011 .

[29]  De Ayala,et al.  The Theory and Practice of Item Response Theory , 2008 .

[30]  Robert J. Mislevy,et al.  Estimating Population Characteristics From Sparse Matrix Samples of Item Responses , 1992 .

[31]  R Core Team,et al.  R: A language and environment for statistical computing. , 2014 .

[32]  Steven P. Reise,et al.  The role of the bifactor model in resolving dimensionality issues in health outcomes measures , 2007, Quality of Life Research.

[33]  Desa,et al.  Bi-factor Multidimensional Item Response Theory Modeling for Subscores Estimation, Reliability, and Classification , 2012 .

[34]  W. M. Yen Increasing item complexity: A possible cause of scale shrinkage for unidimensional item response theory , 1985 .

[35]  Thakur B. Karkee,et al.  Separate versus Concurrent Calibration Methods in Vertical Scaling. , 2003 .

[36]  H. Swaminathan,et al.  Bayesian estimation in the two-parameter logistic model , 1985 .

[37]  J. Koepfler Examining the Bifactor IRT Model for Vertical Scaling in K-12 Assessment , 2012 .

[38]  Investigation of Student Growth Recovery in a Fixed-Item Linking Procedure With a Fixed-Person Prior Distribution for Mixed-Format Test Data , 2005 .

[39]  Melvin R. Novick,et al.  Some latent trait models and their use in inferring an examinee's ability , 1966 .

[40]  Seock-Ho Kim,et al.  A Comparison of Linking and Concurrent Calibration Under the Graded Response Model , 1997 .

[41]  Thomas R. Boucher,et al.  Test Equating, Scaling, and Linking: Methods and Practices , 2007 .

[42]  Michael J. Kolen,et al.  Comparisons of Methodologies and Results in Vertical Scaling for Educational Achievement Tests , 2007 .

[43]  E. Muraki,et al.  Full-Information Item Factor Analysis , 1988 .

[44]  S. Reise The Rediscovery of Bifactor Measurement Models , 2012 .

[45]  R. D. Bock,et al.  The Next Stage in Educational Assessment , 1982 .

[46]  Howard Wainer,et al.  Augmented Scores-"Borrowing Strength" to Compute Scores Based on Small Numbers of Items , 2001 .

[47]  R. Lissitz,et al.  An Evaluation of the Accuracy of Multidimensional IRT Linking , 2000 .

[48]  David M. Shoemaker,et al.  Principles and procedures of multiple matrix sampling. , 1973 .

[49]  Calibration of Response Data Using MIRT Models With Simple and Mixed Structures , 2012 .

[50]  D. Andrich,et al.  Formalizing dimension and response violations of local independence in the unidimensional Rasch model. , 2008, Journal of applied measurement.

[51]  P. L. Adams The Origins of Intelligence in Children , 1976 .

[52]  Kyung Yong Kim IRT linking methods for the bifactor model: a special case of the two-tier item factor analysis model , 2017 .

[53]  S. Haberman,et al.  Equating of Subscores and Weighted Averages Under the NEAT Design , 2011 .

[54]  Jimmy de la Torre,et al.  Parameter Estimation With Small Sample Size A Higher-Order IRT Model Approach , 2010 .

[55]  Thomas Patrick Proctor An investigation of the effects of varying the domain definition of science and method of scaling on a vertical scale , 2008 .

[56]  S. Haberman,et al.  An NCME Instructional Module on Subscores , 2011 .

[57]  Richard M. Luecht,et al.  Applications of Multidimensional Diagnostic Scoring for Certification and Licensure Tests. , 2003 .

[58]  J. Piaget The Growth Of Logical Thinking From Childhood To Adolescence: An Essay On The Construction Of Formal Operational Structures , 1958 .

[59]  Minjeong Jeon,et al.  A Third-Order Item Response Theory Model for Modeling the Effects of Domains and Subdomains in Large-Scale Educational Assessment Surveys , 2014 .

[60]  R. Philip Chalmers,et al.  mirt: A Multidimensional Item Response Theory Package for the R Environment , 2012 .

[61]  Li Cai,et al.  Generalized full-information item bifactor analysis. , 2011, Psychological methods.

[62]  R. Darrell Bock,et al.  IRT Estimation of Domain Scores , 1997 .

[63]  John K. Kruschke,et al.  Bayesian estimation in hierarchical models , 2015 .

[64]  Andrew P. Jaciw,et al.  Matrix Sampling of Items in Large-Scale Assessments , 2002 .

[65]  S. Reise,et al.  Bifactor Models and Rotations: Exploring the Extent to Which Multidimensional Data Yield Univocal Scale Scores , 2010, Journal of personality assessment.

[66]  A. Béguin,et al.  MCMC estimation and some model-fit analysis of multidimensional IRT models , 2001 .

[67]  Yu-Feng Chang A Restricted Bi-factor Model of Subdomain Relative Strengths and Weaknesses , 2015 .

[68]  M. Eastwood The Effects of Construct Shift and Model-Data Misfit on Estimates of Growth Using Vertical Scales , 2014 .

[69]  A Bayesian/IRT Index of Objective Performance for Tests with Mixed Item Types , 1997 .

[70]  C. Parsons,et al.  Application of Unidimensional Item Response Theory Models to Multidimensional Data , 1983 .

[71]  I. Sun-Geun Baek Implications of Cognitive Psychology for Educational Testing , 1994 .

[72]  Z. Kablan,et al.  Science Achievement in TIMSS Cognitive Domains Based on Learning Styles. , 2013 .

[73]  Christopher K. Wikle,et al.  Bayesian Multidimensional IRT Models With a Hierarchical Structure , 2008 .

[74]  Carolyn A. Haug,et al.  Stability of School-Building Accountability Scores and Gains , 2002 .

[75]  Li Cai,et al.  A Two-Tier Full-Information Item Factor Analysis Model with Applications , 2010 .

[76]  Francis Tuerlinckx,et al.  A nonlinear mixed model framework for item response theory. , 2003, Psychological methods.

[77]  B. Muthén Latent variable structural equation modeling with categorical data , 1983 .

[78]  Lihua Yao Reporting Valid and Reliable Overall Scores and Domain Scores , 2010 .

[79]  Jiqiang Guo,et al.  Stan: A Probabilistic Programming Language. , 2017, Journal of statistical software.

[80]  Christine E. DeMars Scoring Subscales Using Multidimensional Item Response Theory Models. , 2005 .

[81]  David Thissen,et al.  Diagnostic Scores Augmented Using Multidimensional Item Response Theory: Preliminary Investigation of MCMC Strategies , 2005 .

[82]  D. Thissen,et al.  Factor analysis for items scored in two categories , 2000 .

[83]  Huijuan Meng,et al.  A comparison study of IRT calibration methods for mixed-format tests in vertical scaling , 2007 .