Explanatory Secondary Dimension Modeling of Latent Differential Item Functioning

The models used in this article are secondary dimension mixture models with the potential to explain differential item functioning (DIF) between latent classes, called latent DIF. The focus is on models in which the secondary dimension is both specific to the DIF latent class and linked to an item property. The article describes the models, shows how their parameters can be estimated with readily available software, and illustrates how the models behave in two applications. One application concerns a test that is sensitive to speededness; the other is based on an arithmetic operations test in which the division items show latent DIF.


[286]  Jennifer A. Johnson-Hanks Demographic Transitions and Modernity , 2008 .

[287]  David M. Williams,et al.  Accounting for Statistical Artifacts in Item Bias Research , 1984 .

[288]  M. Linn,et al.  Scientific arguments as learning artifacts: designing for learning from the web with KIE , 2000 .

[289]  Kristian G. Olesen,et al.  HUGIN - A Shell for Building Bayesian Belief Universes for Expert Systems , 1989, IJCAI.

[290]  Paul De Boeck,et al.  The Random Weights Linear Logistic Test Model , 2002 .

[291]  Neil J. Dorans,et al.  Demonstrating the utility of the standardization approach to assessing unexpected differential item performance on the Scholastic Aptitude Test. , 1986 .

[292]  Meryl W. Bertenthal,et al.  Systems for state science assessment , 2005 .

[293]  Meryl W. Bertenthal,et al.  Uncommon Measures: Equivalence and Linkage among Educational Tests. , 1999 .

[294]  Howard Wainer,et al.  Item Clusters and Computerized Adaptive Testing: A Case for Testlets , 1987 .

[295]  Robert J. Mislevy,et al.  Test Theory Reconceived , 1996 .

[296]  S. Messick Meaning and Values in Test Validation: The Science and Ethics of Assessment , 1989 .

[297]  M. Meulders,et al.  Cross-Classification Multilevel Logistic Models in Psychometrics , 2003 .

[298]  B. Edwards,et al.  Internal consistency and validity of the Stroke Impact Scale 2.0 (SIS 2.0) and SIS-16 in an Australian sample , 2003, Quality of Life Research.

[299]  N. Brown,et al.  A Model of Cognition: The Missing Cornerstone of Assessment , 2011 .

[300]  Scott Marion,et al.  Moving toward a Comprehensive Assessment System: A Framework for Considering Interim Assessments. , 2009 .

[301]  Florian G. Kaiser,et al.  Goal-directed conservation behavior: the specific composition of a general performance , 2004 .

[302]  R. Shavelson,et al.  On the evaluation of systemic science education reform: Searching for instructional sensitivity , 2002 .

[303]  Responding to Claims of Misrepresentation , 2010 .

[304]  E. Kofsky A SCALOGRAM STUDY OF CLASSIFICATORY DEVELOPMENT , 1966 .

[305]  J. Scheuneman A METHOD OF ASSESSING BIAS IN TEST ITEMS , 1979 .

[306]  Stella Vosniadou,et al.  Mental Models of the Day/Night Cycle , 1994, Cogn. Sci..

[307]  Kathleen Scalise,et al.  Mapping student understanding in chemistry: The Perspectives of Chemists , 2009 .

[308]  P. Holland,et al.  DIF DETECTION AND DESCRIPTION: MANTEL‐HAENSZEL AND STANDARDIZATION1,2 , 1992 .

[309]  C. Kendall,et al.  Measuring factors underlying intendedness of women's first and later pregnancies. , 2004, Perspectives on sexual and reproductive health.

[310]  David Thissen,et al.  Beyond group-mean differences: The concept of item bias. , 1986 .

[311]  Cees A. W. Glas,et al.  DETECTION OF DIFFERENTIAL ITEM FUNCTIONING USING LAGRANGE MULTIPLIER TESTS , 1996 .

[312]  D. Symmons,et al.  ICF Core Sets for rheumatoid arthritis. , 2004, Journal of rehabilitation medicine.

[313]  Mark R. Wilson,et al.  Improving assessment evidence in e-learning products: some solutions for reliability , 2010, Int. J. Learn. Technol..

[314]  M. Davison,et al.  Modeling Individual Differences in Numerical Reasoning Speed as a Random Effect of Response Time Limits , 2011 .

[315]  J. Biggs,et al.  Teaching For Quality Learning At University , 1999 .

[316]  J. Bruner The act of discovery. , 1961 .

[317]  Decomposition of a Rasch partial credit item into independent binary and indecomposable trinary items , 1996 .

[318]  J. M. Hines,et al.  Analysis and synthesis of research on responsible environmental behavior: A meta-analysis. , 1987 .

[319]  Derek C. Briggs,et al.  Diagnostic Assessment With Ordered Multiple-Choice Items , 2006 .

[320]  Robert J. Mislevy,et al.  Technology Supports for Assessment Design , 2010 .

[321]  Gene V. Glass,et al.  Standards and Criteria* , 1978, Journal of MultiDisciplinary Evaluation.

[322]  Franz Emanuel Weinert,et al.  Concept of competence: A conceptual clarification , 2001 .

[323]  Sun-Joo Cho,et al.  A Comparison of Item Calibration Procedures in the Presence of Test Speededness. , 2012 .

[324]  G. Karabatsos,et al.  The Rasch model, additive conjoint measurement, and new models of probabilistic measurement theory. , 2001, Journal of applied measurement.

[325]  Mark R. Wilson,et al.  Running Head: Measurement at the Knowledge Level. A Theory of the Measurement of Knowledge Content, Access, and Learning. , 1996 .

[326]  Raymond J. Adams,et al.  Multilevel Item Response Models: An Approach to Errors in Variables Regression , 1997 .

[327]  Mark Wilson,et al.  Improving measurement in health education and health behavior research using item response modeling: comparison with the classical test theory approach. , 2006, Health education research.

[328]  Gregory J. Cizek,et al.  Reconsidering Standards and Criteria , 1993 .

[329]  Practical Formulations of the Latent Growth Item Response Model. , 2010 .

[330]  A. Glasier,et al.  Unintended pregnancy and use of emergency contraception among a large cohort of women attending for antenatal care or abortion in Scotland , 2006, The Lancet.

[331]  H. Akaike A new look at the statistical model identification , 1974 .

[332]  R. Mislevy Linking Educational Assessments: Concepts, Issues, Methods, and Prospects. , 1992 .

[333]  Ben Kelcey,et al.  How and when does complex reasoning occur? Empirically driven development of a learning progression focused on complex reasoning about biodiversity , 2009 .

[334]  A. Satorra,et al.  Complex Sample Data in Structural Equation Modeling , 1995 .

[335]  Juliet Popper Shaffer Bidirectional Unbiased Procedures , 1974 .

[336]  Allan S. Cohen,et al.  Threats to the Valid Use of Assessments , 1996 .

[337]  T. K. Roy,et al.  Do current measurement approaches underestimate levels of unwanted childbearing? Evidence from rural India , 2006, Population studies.

[338]  S. Natasha Beretvas,et al.  Longitudinal Rasch Modeling in the Context of Psychotherapy Outcomes Assessment , 2006 .

[339]  Kate Wall,et al.  Interactive whole class teaching in the National Literacy and Numercy Strategies , 2004 .

[340]  L. Guttman A basis for scaling qualitative data. , 1944 .

[341]  R. Linn Educational measurement, 3rd ed. , 1989 .

[342]  S J Jejeebhoy,et al.  Adolescent sexual and reproductive behavior: a review of the evidence from India. , 1998, Social science & medicine.

[343]  G. Schwarz Estimating the Dimension of a Model , 1978 .

[344]  F. Samejima Estimation of latent ability using a response pattern of graded scores , 1968 .

[345]  Russell G. Almond,et al.  A Four-Process Architecture for Assessment Delivery, with Connections to Assessment Design , 2002 .

[346]  Derek C. Briggs,et al.  An introduction to multidimensional measurement using Rasch models. , 2003, Journal of applied measurement.

[347]  E. Maris Estimating multiple classification latent class models , 1999 .

[348]  Carol M. Woods Please Scroll down for Article Multivariate Behavioral Research Evaluation of Mimic-model Methods for Dif Testing with Comparison to Two- Group Analysis , 2022 .

[349]  Mark Wilson,et al.  Environmental knowledge and conservation behavior : exploring prevalence and structure in a representative sample , 2004 .

[350]  Allan S. Cohen,et al.  Item Parameter Estimation Under Conditions of Test Speededness: Application of a Mixture Rasch Model With Ordinal Constraints , 2002 .

[351]  R. Siegler Three aspects of cognitive development , 1976, Cognitive Psychology.

[352]  K. Wellings,et al.  Conceptualisation, development, and evaluation of a measure of unplanned pregnancy , 2004, Journal of Epidemiology and Community Health.

[353]  F. Song,et al.  Bmc Musculoskeletal Disorders a Systematic Review of Outcomes Assessed in Randomized Controlled Trials of Surgical Interventions for Carpal Tunnel Syndrome Using the International Classification of Functioning, Disability and Health (icf) as a Reference Tool , 2022 .

[354]  Cynthia G. Parshall,et al.  Innovative Item Types for Computerized Testing , 2000 .

[355]  Terry E. Duncan,et al.  An Introduction to Latent Variable Growth Curve Modeling: Concepts, Issues, and Application, Second Edition , 1999 .

[356]  Sally Brown,et al.  Assessment for Learning , 2005 .

[357]  D. Olson,et al.  The Handbook of education and human development : new models of learning, teaching, and schooling , 1996 .

[358]  Howard Wainer,et al.  Was It Ethnic and Social-Class Bias or Statistical Artifact? Logical and Empirical Evidence against Freedle's Method for Reestimating SAT Scores , 2005 .

[359]  N. Breslow,et al.  Approximate inference in generalized linear mixed models , 1993 .

[360]  R. Almond,et al.  Focus Article: On the Structure of Educational Assessments , 2003 .

[361]  Jürgen Rost,et al.  Rasch Models in Latent Classes: An Integration of Two Approaches to Item Analysis , 1990 .

[362]  Raymond J. Adams,et al.  The Multidimensional Random Coefficients Multinomial Logit Model , 1997 .

[363]  Philip Johnson,et al.  Children's understanding of changes of state involving the gas state, Part 2: Evaporation and condensation below boiling point , 1998 .

[364]  J. Muñiz,et al.  Utility of the Mantel-Haenszel Procedure for Detecting Differential Item Functioning in Small Samples , 2004 .

[365]  A. Kollmuss,et al.  Mind the Gap: Why do people act environmentally and what are the barriers to pro-environmental behavior? , 2002 .

[366]  D. Streiner,et al.  Health Measurement Scales: A practical guide to thier development and use , 1989 .

[367]  J. S. Long,et al.  Regression Models for Categorical and Limited Dependent Variables , 1997 .

[368]  T. Joyce,et al.  On the validity of retrospective assessments of pregnancy intention , 2002, Demography.

[369]  G Grimby,et al.  Scoring alternatives for FIM in neurological disorders applying Rasch analysis , 2005, Acta neurologica Scandinavica.

[370]  H. Hoijtink Linear and repeated measures models for the person parameters , 1995 .

[371]  Han L. J. van der Maas,et al.  Towards better computational models of the balance scale task: A reply to Shultz and Takane , 2007, Cognition.

[372]  G. Masters,et al.  Rating scale analysis , 1982 .

[373]  Melvin R. Novick,et al.  Some latent train models and their use in inferring an examinee's ability , 1966 .

[374]  L. Cronbach Essentials of psychological testing , 1960 .

[375]  Allan S. Cohen,et al.  A Speeded Item Response Model with Gradual Process Change , 2008 .

[376]  J. Gill Hierarchical Linear Models , 2005 .

[377]  L. Chawla,et al.  Education for strategic environmental behavior , 2007 .

[378]  Jean Piaget,et al.  Epistemology and Psychology of Functions , 1977 .

[379]  J. Fox,et al.  Bayesian estimation of a multilevel IRT model using gibbs sampling , 2001 .

[380]  A. Agresti,et al.  Categorical Data Analysis , 1991, International Encyclopedia of Statistical Science.

[381]  C. Pappas,et al.  Exploring the role of intertextuality in concept construction: Urban second graders make sense of evaporation, boiling, and condensation , 2006 .

[382]  Pamela Joy Mulhall,et al.  What is the purpose of this experiment? Or can students learn something from doing experiments? , 2000 .

[383]  J. R. Landis,et al.  The measurement of observer agreement for categorical data. , 1977, Biometrics.

[384]  C. Dweck Self-Theories: Their Role in Motivation, Personality, and Development. Essays in Social Psychology. , 1999 .

[385]  William Stout,et al.  A Multidimensionality-Based DIF Analysis Paradigm , 1996 .

[386]  P. Black,et al.  Assessment and Classroom Learning , 1998 .

[387]  T. Bedirhan Üstün,et al.  Application of the International Classification of Functioning, Disability and Health (ICF) in clinical practice , 2002, Disability and rehabilitation.

[388]  Igal Galili,et al.  Stages of children's views about evaporation , 1994 .

[389]  I. Autti-Rämö,et al.  Effectiveness of physical therapy interventions for children with cerebral palsy: A systematic review , 2008, BMC pediatrics.

[390]  Bengt Muthén,et al.  Multiple Group IRT Modeling: Applications to Item Bias Analysis , 1985 .

[391]  B. Junker,et al.  Cognitive Assessment Models with Few Assumptions, and Connections with Nonparametric Item Response Theory , 2001 .

[392]  L. Skovgaard NONLINEAR MODELS FOR REPEATED MEASUREMENT DATA. , 1996 .

[393]  M. Moos,et al.  Measuring the intensity of pregnancy planning effort. , 2003, Paediatric and perinatal epidemiology.

[394]  L. Shepard,et al.  Methods for Identifying Biased Test Items , 1994 .

[395]  P. Boeck,et al.  Explanatory item response models : a generalized linear and nonlinear approach , 2004 .

[396]  Frans J. Oort,et al.  Simulation study of item bias detection with restricted factor analysis , 1998 .

[397]  R. Glaser The Reemergence of Learning Theory within Instructional Research. , 1990 .

[398]  R. Mislevy,et al.  Marginal maximum likelihood estimation for a psychometric model of discontinuous development , 1996 .

[399]  L. Shulman Knowledge and Teaching: Foundations of the New Reform , 1987 .

[400]  A. Cohen,et al.  The Role of Extended Time and Item Content on a High–Stakes Mathematics Test , 2005 .

[401]  Edward Haksing Ip,et al.  Empirically indistinguishable multidimensional IRT and locally dependent unidimensional item response models. , 2010, The British journal of mathematical and statistical psychology.

[402]  M. Browne,et al.  Alternative Ways of Assessing Model Fit , 1992 .

[403]  Marcel A. Croon,et al.  Latent class analysis with ordered latent classe , 1990 .

[404]  F. Mayer,et al.  The connectedness to nature scale: A measure of individuals’ feeling in community with nature ☆ , 2004 .

[405]  Ronald K. Hambleton,et al.  Small Sample Studies to Detect Flaws in Item Translations , 2001 .

[406]  P. Tugwell,et al.  The World Health Organisation International Classification of Functioning, Disability and Health: a conceptual model and interface for the OMERACT process. , 2007, The Journal of rheumatology.

[407]  Roger E. Millsap,et al.  On the misuse of manifest variables in the detection of measurement bias , 1992 .

[408]  Howard Wainer,et al.  The Rasch Model as Additive Conjoint Measurement , 1979 .

[409]  Raymond J. Adams,et al.  Charting of Student Progress , 1999 .

[410]  Kathleen A. O'Neill,et al.  Item and test characteristics that are associated with differential item functioning. , 1993 .

[411]  D. Borsboom,et al.  The Theoretical Status of Latent Variables , 2003 .

[412]  Person regression models , 2004 .

[413]  Margaret Wu,et al.  ACER conquest: generalised item response modelling software , 1998 .

[414]  Saul Geiser,et al.  UC and the SAT: Predictive Validity and Differential Impact of the SAT I and SAT II at the University of California , 2001 .

[415]  Wim Van Den Noortgate,et al.  Assessing and Explaining Differential Item Functioning Using Logistic Mixed Models , 2005 .

[416]  R. Duschl,et al.  Strategies and Challenges to Changing the Focus of Assessment and Instruction in Science Classrooms. , 1997 .

[417]  S. Embretson,et al.  Item response theory for psychologists , 2000 .

[418]  Cynthia G. Parshall,et al.  Exact Versus Asymptotic Mantel-Haenszel DIF Statistics: A Comparison of Performance Under Small-Sample Conditions , 1995 .

[419]  F. Kaiser,et al.  Competence Formation in Environmental Education: Advancing Ecology-Specific Rather Than General Abilities , 2008 .

[420]  A. Demetriou,et al.  The person's conception of the structures of developing intellect: early adolescence to middle age. , 1989, Genetic, social, and general psychology monographs.

[421]  J. Hagenaars,et al.  Applied Latent Class Analysis , 2003 .

[422]  D. Kuhn Science as argument : Implications for teaching and learning scientific thinking , 1993 .

[423]  R. Almond,et al.  A BRIEF INTRODUCTION TO EVIDENCE-CENTERED DESIGN , 2003 .