Some Comments on Representing Construct Levels in Psychometric Models

This paper addresses one of the steps needed to trace the connection between the substantive theory underlying an assessment and the mathematical models used to analyze and rate student responses. We explore this connection in the context of hypothesized variables that (a) have multiple ordered levels, (b) have been assessed with polytomous items meant to capture those ordered performance levels, but (c) are modeled as continuous rather than ordinal. We present a straightforward method for estimating interpretable level boundaries when using the partial credit model. We then introduce graphical methods to evaluate the relationship between the levels as estimated by the model and the performance levels originally hypothesized by the theory. We believe that this kind of procedure can help practitioners make meaningful interpretations and provide more accurate diagnostic information to respondents.
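The abstract does not spell out how level boundaries are computed, so the following is only an illustrative sketch and not the authors' procedure. It assumes the standard partial credit model parameterization with step difficulties, and it assumes that a boundary for score level k on a single item can be taken as the Thurstonian threshold, that is, the point on the latent continuum where the probability of scoring k or higher equals 0.5. The function names `pcm_probs` and `thurstonian_threshold` are hypothetical.

```python
import math

def pcm_probs(theta, deltas):
    """Category probabilities for one partial credit item.

    theta  : person location on the latent continuum
    deltas : step difficulties delta_1..delta_m (categories are 0..m)
    """
    # Exponent for category k is the sum of (theta - delta_j) for j <= k;
    # category 0 has exponent 0 by convention.
    exponents = [0.0]
    for d in deltas:
        exponents.append(exponents[-1] + (theta - d))
    # Subtract the maximum before exponentiating for numerical stability.
    mx = max(exponents)
    weights = [math.exp(e - mx) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

def thurstonian_threshold(deltas, k, lo=-10.0, hi=10.0, tol=1e-8):
    """Location where P(X >= k) = 0.5, found by bisection.

    Assumes the threshold lies inside [lo, hi]; P(X >= k) is
    increasing in theta under the partial credit model.
    """
    def p_at_least(theta):
        return sum(pcm_probs(theta, deltas)[k:])

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if p_at_least(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical three-category item with step difficulties -1 and 1.
boundaries = [thurstonian_threshold([-1.0, 1.0], k) for k in (1, 2)]
```

Under these assumptions, `boundaries` holds one candidate cut point per score level for that item. Comparing such cut points across items is one way to judge whether the modeled levels line up with the levels the theory hypothesizes.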


[289]  Ronald K. Hambleton,et al.  Small Sample Studies to Detect Flaws in Item Translations , 2001 .

[290]  R. Goldenberg,et al.  Changes in Intendedness During Pregnancy in a High-Risk Multiparous Population , 2000, Maternal and Child Health Journal.

[291]  Terry A. Ackerman A Didactic Explanation of Item Bias, Item Impact, and Item Validity from a Multidimensional Perspective , 1992 .

[292]  J. S. Long,et al.  Regression Models for Categorical and Limited Dependent Variables , 1997 .

[293]  Practical Formulations of the Latent Growth Item Response Model. , 2010 .

[294]  Karsten Schnack,et al.  The action competence approach in environmental education , 1997 .

[295]  Sun-Joo Cho,et al.  A Comparison of Item Calibration Procedures in the Presence of Test Speededness. , 2012 .

[296]  S. Bamberg,et al.  Twenty years after Hines, Hungerford, and Tomera: A new meta-analysis of psycho-social determinants of pro-environmental behaviour , 2007 .

[297]  P. McCullagh,et al.  Generalized Linear Models , 1992 .

[298]  F. Mayer,et al.  The connectedness to nature scale: A measure of individuals’ feeling in community with nature ☆ , 2004 .

[299]  Ralph W. Tyler,et al.  Basic Principles of Curriculum and Instruction , 1969 .

[300]  E. B. Andersen,et al.  CONDITIONAL INFERENCE FOR MULTIPLE‐CHOICE QUESTIONNAIRES , 1973 .

[301]  R. MacIntosh,et al.  Variance Estimation for Converting MIMIC Model Parameters to IRT Parameters in DIF Analysis , 2003 .

[302]  C. Tebé,et al.  Gender differences in health-related quality of life among the elderly: the role of objective functional capacity and chronic conditions. , 2006, Social science & medicine.

[303]  P. Tugwell,et al.  The World Health Organisation International Classification of Functioning, Disability and Health: a conceptual model and interface for the OMERACT process. , 2007, The Journal of rheumatology.

[304]  G. H. Fischer,et al.  The linear logistic test model as an instrument in educational research , 1973 .

[305]  John B. Willett,et al.  Using Covariance Structure Analysis to Model Change over Time , 2000 .

[306]  R. Shavelson,et al.  On the evaluation of systemic science education reform: Searching for instructional sensitivity , 2002 .

[307]  L. Guttman A basis for scaling qualitative data. , 1944 .

[308]  Daniel Bolt,et al.  DIFFERENTIAL ITEM FUNCTIONING: ITS MULTIDIMENSIONAL MODEL AND RESULTING SIBTEST DETECTION PROCEDURE , 1996 .

[309]  F. Y. Edgeworth I.—The Statistics of Examinations , 1888 .

[310]  T. B. Üstün,et al.  Development of ICF Core Sets for patients with chronic conditions. , 2004, Journal of rehabilitation medicine.

[311]  Derek C. Briggs,et al.  An introduction to multidimensional measurement using Rasch models. , 2003, Journal of applied measurement.

[312]  R. Glaser The Reemergence of Learning Theory within Instructional Research. , 1990 .

[313]  R. Mislevy,et al.  Marginal maximum likelihood estimation for a psychometric model of discontinuous development , 1996 .

[314]  Steven J. Osterlind,et al.  Constructing Test Items: Multiple-Choice, Constructed-Response, Performance and Other Formats , 2006 .

[315]  D. Borsboom,et al.  The Theoretical Status of Latent Variables , 2003 .

[316]  Randall D. Penfield Modeling DIF Effects Using Distractor-Level Invariance Effects: Implications for Understanding the Causes of DIF , 2010 .

[317]  T. Bedirhan Üstün,et al.  Application of the International Classification of Functioning, Disability and Health (ICF) in clinical practice , 2002, Disability and rehabilitation.

[318]  L. Cronbach Essentials of psychological testing , 1960 .

[319]  Mark Wilson,et al.  Constructing Measures: An Item Response Modeling Approach , 2004 .

[320]  R. Almond,et al.  Focus Article: On the Structure of Educational Assessments , 2003 .

[321]  J. Hagenaars,et al.  Applied Latent Class Analysis , 2003 .

[322]  D. Kuhn Science as argument : Implications for teaching and learning scientific thinking , 1993 .

[323]  M. Davison,et al.  Modeling Individual Differences in Numerical Reasoning Speed as a Random Effect of Response Time Limits , 2011 .

[324]  P. Black,et al.  Teachers developing assessment for learning: impact on student achievement , 2004 .

[325]  Dorothy T. Thayer,et al.  Differential Item Performance and the Mantel-Haenszel Procedure. , 1986 .

[326]  H. Kaiser,et al.  Directional statistical decisions. , 1960, Psychological review.

[327]  P. Bentler,et al.  Significance Tests and Goodness of Fit in the Analysis of Covariance Structures , 1980 .

[328]  Yeow Meng Thum,et al.  Setting Performance Standards: Concepts, Methods, and Perspectives , 2003 .

[329]  Allan S. Cohen,et al.  Item Parameter Estimation Under Conditions of Test Speededness: Application of a Mixture Rasch Model With Ordinal Constraints , 2002 .

[330]  Wim Van Den Noortgate,et al.  Assessing and Explaining Differential Item Functioning Using Logistic Mixed Models , 2005 .

[331]  R. Duschl,et al.  Strategies and Challenges to Changing the Focus of Assessment and Instruction in Science Classrooms. , 1997 .

[332]  J. Muñiz,et al.  Utility of the Mantel-Haenszel Procedure for Detecting Differential Item Functioning in Small Samples , 2004 .

[333]  Yilmaz Saglam,et al.  Middle school students' beliefs about matter , 2005 .

[334]  Margaret Wu The development and application of a fit test for use with marginal maximum likelihood estimation and generalised item response models , 1997 .

[335]  Shawn Y. Stevens,et al.  Developing a Hypothetical Multi-Dimensional Learning Progression for the Nature of Matter. , 2009 .

[336]  F. Samejima Estimation of latent ability using a response pattern of graded scores , 1968 .

[337]  R. Mislevy,et al.  Psychometric Principles in Student Assessment. CSE Technical Report. , 2002 .

[338]  A. Gopnik,et al.  The theory theory. , 1994 .

[339]  Jin-Yi Chang Teachers college students' conceptions about evaporation, condensation, and boiling , 1999 .

[340]  D. Allen,et al.  Improving measurement in health education and health behavior research using item response modeling: introducing item response modeling. , 2006, Health education research.

[341]  J. Biggs,et al.  Teaching For Quality Learning At University , 1999 .

[342]  N. Brown,et al.  A Model of Cognition: The Missing Cornerstone of Assessment , 2011 .

[343]  Roger E. Millsap,et al.  On the misuse of manifest variables in the detection of measurement bias , 1992 .

[344]  F. Lord Applications of Item Response Theory To Practical Testing Problems , 1980 .

[345]  Howard Wainer,et al.  Use of item response theory in the study of group differences in trace lines. , 1988 .

[346]  Alipaşa Ayas,et al.  Evaporation in different liquids: secondary students’ conceptions , 2005 .

[347]  David M. Williams,et al.  Accounting for Statistical Artifacts in Item Bias Research , 1984 .

[348]  J. Scheuneman A METHOD OF ASSESSING BIAS IN TEST ITEMS , 1979 .

[349]  Person regression models , 2004 .

[350]  Igal Galili,et al.  Stages of children's views about evaporation , 1994 .

[351]  R. Freedle Correcting the SAT's ethnic and social-class bias: A method for reestimating SAT scores. , 2003 .

[352]  P. Hewson,et al.  Accommodation of a scientific conception: Toward a theory of conceptual change , 1982 .

[353]  Steven J. Ingels,et al.  Base-Year to Fourth Follow-up Data File User's Manual. National Education Longitudinal Study of 1988. NCES 2002-323. , 2002 .

[354]  Mark Wilson,et al.  Validating a Learning Progression in Mathematical Functions for College Readiness , 2011 .

[355]  A. Cohen,et al.  The Role of Extended Time and Item Content on a High–Stakes Mathematics Test , 2005 .

[356]  Edward Haksing Ip,et al.  Empirically indistinguishable multidimensional IRT and locally dependent unidimensional item response models. , 2010, The British journal of mathematical and statistical psychology.

[357]  Raymond J. Adams,et al.  The Multidimensional Random Coefficients Multinomial Logit Model , 1997 .

[358]  Cees A. W. Glas,et al.  DETECTION OF DIFFERENTIAL ITEM FUNCTIONING USING LAGRANGE MULTIPLIER TESTS , 1996 .

[359]  D. Andrich Application of a Psychometric Rating Model to Ordered Categories Which Are Scored with Successive Integers , 1978 .

[360]  S. Hall,et al.  The construct of internalization: conceptualization, measurement, and prediction of smoking treatment outcome , 2005, Psychological Medicine.

[361]  R. Freedle On Replicating Ethnic Test Bias Effects: The Santelices and Wilson Study. , 2010 .

[362]  B. Junker,et al.  Cognitive Assessment Models with Few Assumptions, and Connections with Nonparametric Item Response Theory , 2001 .

[363]  Georg Rasch,et al.  Probabilistic Models for Some Intelligence and Attainment Tests , 1981, The SAGE Encyclopedia of Research Design.

[364]  Mark R. Wilson The role of mathematical models in measurement: a perspective from psychometrics , 2011 .

[365]  Meryl W. Bertenthal,et al.  Uncommon Measures: Equivalence and Linkage among Educational Tests. , 1999 .

[366]  Kenneth A. Bollen,et al.  Structural Equations with Latent Variables , 1989 .

[367]  Wen-Chung Wang,et al.  Effects of Average Signed Area Between Two Item Characteristic Curves and Test Purification Procedures on the DIF Detection via the Mantel-Haenszel Method , 2004 .

[368]  R. Almond,et al.  A BRIEF INTRODUCTION TO EVIDENCE-CENTERED DESIGN , 2003 .

[369]  T. K. Roy,et al.  Do current measurement approaches underestimate levels of unwanted childbearing? Evidence from rural India , 2006, Population studies.

[370]  Alan M Jette,et al.  Blending activity and participation sub-domains of the ICF , 2007, Disability and rehabilitation.

[371]  Robert J. Mislevy,et al.  Technology Supports for Assessment Design , 2010 .

[372]  Alija Kulenović,et al.  Standards for Educational and Psychological Testing , 1999 .

[373]  R. Freedle,et al.  A COMPARISON OF STRATEGIES USED BY BLACK AND WHITE STUDENTS IN SOLVING SAT VERBAL ANALOGIES USING A THINKING ALOUD METHOD AND A MATCHED PERCENTAGE-CORRECT DESIGN , 1987 .

[374]  R. Glaser,et al.  Knowing What Students Know: The Science and Design of Educational Assessment , 2001 .

[375]  David Thissen,et al.  Beyond group-mean differences: The concept of item bias. , 1986 .

[376]  M. Chren,et al.  Testing and reducing skindex-29 using Rasch analysis: Skindex-17. , 2006, The Journal of investigative dermatology.

[377]  R. Millsap,et al.  Evaluating the impact of partial factorial invariance on selection in two populations. , 2004, Psychological methods.

[378]  Paul F. Lazarsfeld,et al.  Latent Structure Analysis. , 1969 .

[379]  Mark R. Wilson,et al.  Running Head: Measurement at the Knowledge Level. A Theory of the Measurement of Knowledge Content, Access, and Learning. , 1996 .

[380]  Faith K. Greulich,et al.  Meeting goals and confronting conflict: children's changing perceptions of social comparison. , 1995, Child development.

[381]  Raymond J. Adams,et al.  Multilevel Item Response Models: An Approach to Errors in Variables Regression , 1997 .

[382]  Anton K. Formann,et al.  Linear Logistic Latent Class Analysis and the Rasch Model , 1995 .

[383]  Marcel A. Croon,et al.  Latent class analysis with ordered latent classe , 1990 .

[384]  J. Stanford,et al.  Are all contraceptive failures unintended pregnancies? Evidence from the 1995 National Survey of Family Growth. , 1999, Family planning perspectives.

[385]  S. Erduran,et al.  TAPping into argumentation: Developments in the application of Toulmin's Argument Pattern for studying science discourse , 2004 .

[386]  S. J. Sinclair,et al.  Activity Outcome Measurement for Postacute Care , 2004, Medical care.

[387]  Decomposition of a Rasch partial credit item into independent binary and indecomposable trinary items , 1996 .

[388]  Mark Wilson,et al.  Improving measurement in health education and health behavior research using item response modeling: comparison with the classical test theory approach. , 2006, Health education research.

[389]  G. H. Fischer,et al.  Logistic latent trait models with linear constraints , 1983 .

[390]  Russell G. Almond,et al.  A Four-Process Architecture for Assessment Delivery, with Connections to Assessment Design , 2002 .

[391]  Franz Emanuel Weinert,et al.  Concept of competence: A conceptual clarification , 2001 .

[392]  Saul Geiser,et al.  UC and the SAT: Predictive Validity and Differential Impact of the SAT I and SAT II at the University of California , 2001 .

[393]  S. Raudenbush,et al.  Comparing personal trajectories and drawing causal inferences from longitudinal data. , 2001, Annual review of psychology.

[394]  B. Edwards,et al.  Internal consistency and validity of the Stroke Impact Scale 2.0 (SIS 2.0) and SIS-16 in an Australian sample , 2003, Quality of Life Research.

[395]  Ben Kelcey,et al.  How and when does complex reasoning occur? Empirically driven development of a learning progression focused on complex reasoning about biodiversity , 2009 .

[396]  Howard Wainer,et al.  The Rasch Model as Additive Conjoint Measurement , 1979 .

[397]  J. Fleishman,et al.  Differential item functioning and health assessment , 2007, Quality of Life Research.

[398]  Sally Brown,et al.  Assessment for Learning , 2005 .

[399]  D. Olson,et al.  The Handbook of education and human development : new models of learning, teaching, and schooling , 1996 .

[400]  Gaea Leinhardt,et al.  Functions, Graphs, and Graphing: Tasks, Learning, and Teaching , 1990 .

[401]  Rebecca Zwick,et al.  Fair Game?: The Use of Standardized Admissions Tests in Higher Education , 2002 .

[402]  Robert J. Mislevy,et al.  Test Theory Reconceived , 1996 .

[403]  A. Satorra,et al.  Complex Sample Data in Structural Equation Modeling , 1995 .

[404]  R. Siegler Three aspects of cognitive development , 1976, Cognitive Psychology.

[405]  F. Kaiser,et al.  One for All? Connectedness to Nature, Inclusion of Nature, Environmental Identity, and Implicit Association with Nature , 2011 .

[406]  F. J. Carod-Artal,et al.  Functional recovery and instrumental activities of daily living: follow-up 1-year after treatment in a stroke unit , 2002, Brain injury.

[407]  Benjamin S. Bloom,et al.  Taxonomy of Educational Objectives: The Classification of Educational Goals. , 1957 .

[408]  Torsten Husén,et al.  The international encyclopedia of education : research and studies , 1985 .

[409]  S. Embretson,et al.  Item response theory for psychologists , 2000 .

[410]  M. Patton,et al.  Qualitative evaluation methods , 1981 .

[411]  J. Pearl Causality: Models, Reasoning and Inference , 2000 .

[412]  Geoffrey J. McLachlan,et al.  Finite Mixture Models , 2019, Annual Review of Statistics and Its Application.